model_name_or_path: string indicating the version of the CLIP model to use. Available models are
    `"openai/clip-vit-base-patch16"`, `"openai/clip-vit-base-patch32"`,
    `"openai/clip-vit-large-patch14-336"` and `"openai/clip-vit-large-patch14"`.

Raises:
    ModuleNotFoundError:
        If the `transformers` package is not installed or its version is lower than 4.10.0.
    ValueError:
        If not all images have format [C, H, W].
    ValueError:
        If the number of images and captions does not match.

Example:
    >>> import torch
    >>> _ = torch.manual_seed(42)
    >>> from torchmetrics.functional.multimodal import clip_score
    >>> score = clip_score(torch.randint(255, (3, 224, 224)), "a photo of a cat", "openai/clip-vit-base-patch16")
    >>> score.detach()
    tensor(24.4255)
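As a further illustration, here is a minimal sketch of a batched call, assuming the function also
accepts a matched list of images and a list of captions (the ValueError above implies the pairing
is length-checked); the names `images`, `captions` and `batched_score` are illustrative only:

    >>> import torch
    >>> from torchmetrics.functional.multimodal import clip_score
    >>> # Two images paired with two captions; mismatched lengths would raise ValueError.
    >>> images = [torch.randint(255, (3, 224, 224)) for _ in range(2)]
    >>> captions = ["a photo of a cat", "a photo of a dog"]
    >>> batched_score = clip_score(images, captions, "openai/clip-vit-base-patch16")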