to the forward pass. `doc_scores` can be computed via `question_encoder_last_hidden_state` and `retrieved_doc_embeds`, see examples for more information. context_input_ids (`tf.Tensor` of shape `(batch_size * config.n_docs, config.max_combined_length)`, *optional*, returned when *output_retrieved=True*): Input IDs post-processed from the retrieved documents and the question encoder `input_ids` by the retriever. If the model has is not initialized with a `retriever` ``context_input_ids` has to be provided to the forward pass. `context_input_ids` are returned by [`~RagRetriever.__call__`]. context_attention_mask (`tf.Tensor` of shape `(batch_size * config.n_docs, config.max_combined_length)`, *optional*, returned when *output_retrieved=True*): Attention mask post-processed from the retrieved documents and the question encoder `input_ids` by the retriever. If the model has is not initialized with a `retriever` `context_attention_mask` has to be provided to the forward pass. `context_attention_mask` are returned by [`~RagRetriever.__call__`]. use_cache (`bool`, *optional*, defaults to `True`): If set to `True`, `past_key_values` key value states are returned and can be used to speed up decoding (see `past_key_values`). output_attentions (`bool`, *optional*): Whether or not to return the attentions tensors of all attention layers. See `attentions` under returned tensors for more detail. output_hidden_states (`bool`, *optional*): Whether or not to return the hidden states of all layers. See `hidden_states` under returned tensors for more detail. output_retrieved(`bool`, *optional*): Whether or not to return the `retrieved_doc_embeds`, `retrieved_doc_ids`, `context_input_ids` and `context_attention_mask`. See returned tensors for more detail. return_dict (`bool`, *optional*): Whether or not to return a [`TFRetrievAugLMOutput`] instead of a plain tuple. n_docs (`int`, *optional*, defaults to `config.n_docs``) Number of documents to retrieve and/or number of documents for which to generate an answer. c