The available prompts are:

* quality: "Good photo." vs "Bad photo."
* brightness: "Bright photo." vs "Dark photo."
* noisiness: "Clean photo." vs "Noisy photo."
* colorfullness: "Colorful photo." vs "Dull photo."
* sharpness: "Sharp photo." vs "Blurry photo."
* contrast: "High contrast photo." vs "Low contrast photo."
* complexity: "Complex photo." vs "Simple photo."
* natural: "Natural photo." vs "Synthetic photo."
* happy: "Happy photo." vs "Sad photo."
* scary: "Scary photo." vs "Peaceful photo."
* new: "New photo." vs "Old photo."
* warm: "Warm photo." vs "Cold photo."
* real: "Real photo." vs "Abstract photo."
* beautiful: "Beautiful photo." vs "Ugly photo."
* lonely: "Lonely photo." vs "Sociable photo."
* relaxing: "Relaxing photo." vs "Stressful photo."

Args:
    images: Either a single ``[N, C, H, W]`` tensor or a list of ``[C, H, W]`` tensors.
    model_name_or_path: String indicating the version of the CLIP model to use. By default this argument is set
        to ``clip_iqa``, which corresponds to the model used in the original paper. Other available models are
        `"openai/clip-vit-base-patch16"`, `"openai/clip-vit-base-patch32"`,
        `"openai/clip-vit-large-patch14-336"` and `"openai/clip-vit-large-patch14"`.
    data_range: The maximum value of the input tensor. For example, if the input images are in range [0, 255],
        ``data_range`` should be 255. The images are normalized by this value.
    prompts: A string, tuple of strings, or nested tuple of strings. If a single string is provided, it must be
        one of the available prompts (see above). Otherwise the input is expected to be a tuple, where each
        element can be one of two things: either a string or a tuple of strings. If a string is provided, it
        must be one of the available prompts (see above). If a tuple is provided, it must be of length 2, where
        the first string is a positive prompt and the second string is a negative prompt.

.. note:: If using the default ``clip_iqa`` model, the package ``piq`` must be installed.
    Either install with `pip install piq` or `pip install torchmetrics[multimodal]`.

Returns:
    A tensor of shape ``(N,)`` if a single prompt is provided. If a list of prompts is provided, a dictionary
    with the prompts as keys and tensors of shape ``(N,)`` as values.

Raises:
    ModuleNotFoundError:
        If ``transformers`` package is not installed or version is lower than 4.10.0
    ValueError:
        If not all images have format [C, H, W]
    ValueError:
        If ``prompts`` is a tuple and it is not of length 2
    ValueError:
        If ``prompts`` is a string and it is not one of the available prompts
    ValueError:
        If ``prompts`` is a list of strings and not all strings are one of the available prompts

Example::
    Single prompt:

    >>> from torchmetrics.functional.multimodal import clip_image_quality_assessment
    >>> import torch
    >>> _ = torch.manual_seed(42)
    >>> imgs = torch.randint(255, (2, 3, 224, 224)).float()
    >>> clip_image_quality_assessment(imgs, prompts=("quality",))
    tensor([0.8894, 0.8902])

Example::
    Multiple prompts:

    >>> from torchmetrics.functional.multimodal import clip_image_quality_assessment
    >>> import torch
    >>> _ = torch.manual_seed(42)
    >>> imgs = torch.randint(255, (2, 3, 224, 224)).float()
    >>> clip_image_quality_assessment(imgs, prompts=("quality", "brightness"))
    {'quality': tensor([0.8894, 0.8902]), 'brightness': tensor([0.5507, 0.5208])}

Example::
    Custom prompts. Must always be a tuple of length 2, with a positive and negative prompt.

    >>> from torchmetrics.functional.multimodal import clip_image_quality_assessment
    >>> import torch
    >>> _ = torch.manual_seed(42)
    >>> imgs = torch.randint(255, (2, 3, 224, 224)).float()
    >>> clip_image_quality_assessment(imgs, prompts=(("Super good photo.", "Super bad photo."), "brightness"))
    {'user_defined_0': tensor([0.9652, 0.9629]), 'brightness': tensor([0.5507, 0.5208])}
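Each prompt pair above is scored the same way: the image embedding is compared against both the positive and the negative text embedding, and the returned value is the softmax probability of the positive prompt. A minimal sketch of that scoring step, assuming two precomputed image-text cosine similarities (the similarity values and the logit scale of 100 here are illustrative placeholders, not outputs of a real CLIP model):

```python
import math

def prompt_pair_score(sim_pos: float, sim_neg: float, scale: float = 100.0) -> float:
    """Softmax probability of the positive prompt over the (positive, negative) pair.

    ``sim_pos``/``sim_neg`` stand in for cosine similarities between the image
    embedding and the two prompt embeddings; ``scale`` mimics CLIP's logit scale.
    """
    e_pos = math.exp(scale * sim_pos)
    e_neg = math.exp(scale * sim_neg)
    return e_pos / (e_pos + e_neg)

# An image whose embedding sits closer to "Good photo." than to "Bad photo."
# scores near 1; one closer to the negative prompt scores near 0.
print(round(prompt_pair_score(0.28, 0.25), 4))  # -> 0.9526
```

This is why every returned value lies in ``(0, 1)``: each score is a two-way softmax between the positive and negative prompt, not a raw similarity.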