le features from the last Pixel Decoder. query_position_embeddings (`torch.FloatTensor` of shape `(num_queries, batch_size, hidden_size)`): , *optional*): Position embeddings that are added to the queries and keys in each self-attention layer. encoder_hidden_states (`torch.FloatTensor` of shape `(batch_size, encoder_sequence_length, hidden_size)`): Sequence of hidden-states at the output of the last layer of the encoder. Used in the cross(masked)-attention of the decoder. feature_size_list (`List[torch.Size]` ): This is a list containing shapes (height & width) of multi-scale features from the Pixel Decoder. output_attentions (`bool`, *optional*): Whether or not to return the attentions tensors of all attention layers. See `attentions` under returned tensors for more detail. output_hidden_states (`bool`, *optional*): Whether or not to return the hidden states of all layers. See `hidden_states` under returned tensors for more detail. return_dict (`bool`, *optional*): Whether or not to return a [`~utils.ModelOutput`] instead of a plain tuple. Nr%