en, src_len)* where padding elements are indicated by very large negative values. encoder_hidden_states (`tf.Tensor`): cross attention input to the layer of shape *(seq_len, batch, embed_dim)* encoder_attention_mask (`tf.Tensor`): encoder attention mask of size *(batch, 1, tgt_len, src_len)* where padding elements are indicated by very large negative values. layer_head_mask (`tf.Tensor`): mask for attention heads in a given layer of size *(decoder_attention_heads,)* cross_attn_layer_head_mask (`tf.Tensor`): mask for heads of the cross-attention module. *(decoder_attention_heads,)* past_key_value (`Tuple(tf.Tensor)`): cached past key and value projection states NrI