e_features (`torch.FloatTensor` of shape `(batch_size, output_dim`): The image embeddings obtained by applying the projection layer to the pooled output of [`GroupViTVisionModel`]. Examples: ```python >>> from PIL import Image >>> import requests >>> from transformers import AutoProcessor, GroupViTModel >>> model = GroupViTModel.from_pretrained("nvidia/groupvit-gcc-yfcc") >>> processor = AutoProcessor.from_pretrained("nvidia/groupvit-gcc-yfcc") >>> url = "http://images.cocodataset.org/val2017/000000039769.jpg" >>> image = Image.open(requests.get(url, stream=True).raw) >>> inputs = processor(images=image, return_tensors="pt") >>> image_features = model.get_image_features(**inputs) ```NrT