(masked), the loss is only computed for the tokens with labels in `[0, ..., config.vocab_size]`. use_cache (`bool`, *optional*): If set to `True`, `past_key_values` key value states are returned and can be used to speed up decoding (see `past_key_values`). - 1 for tokens that are **not masked**, - 0 for tokens that are **masked**. output_attentions (`bool`, *optional*): Whether or not to return the attentions tensors of all attention layers. See `attentions` under returned tensors for more detail. output_hidden_states (`bool`, *optional*): Whether or not to return the hidden states of all layers. See `hidden_states` under returned tensors for more detail. return_dict (`bool`, *optional*): Whether or not to return a [`~utils.ModelOutput`] instead of a plain tuple. Returns: Example: ```python >>> from transformers import ( ... SpeechEncoderDecoderModel, ... Speech2Text2ForCausalLM, ... Wav2Vec2Model, ... Speech2Text2Config, ... Wav2Vec2Config, ... Wav2Vec2FeatureExtractor, ... Speech2Text2Tokenizer, ... ) >>> from datasets import load_dataset >>> feature_extractor = Wav2Vec2FeatureExtractor() >>> tokenizer = Speech2Text2Tokenizer.from_pretrained("facebook/s2t-wav2vec2-large-en-de") >>> encoder = Wav2Vec2Model(Wav2Vec2Config()) >>> decoder = Speech2Text2ForCausalLM(Speech2Text2Config()) >>> # init random speech2text model >>> model = SpeechEncoderDecoderModel(encoder=encoder, decoder=decoder) >>> model.config.pad_token_id = tokenizer.pad_token_id >>> model.config.decoder_start_token_id = tokenizer.bos_token_id >>> # pre-process inputs and labels >>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation") >>> inputs = feature_extractor( ... ds[0]["audio"]["array"], sampling_rate=ds[0]["audio"]["sampling_rate"], return_tensors="pt" ... ) >>> input_values = inputs.input_values >>> decoder_input_ids = tokenizer(ds[0]["text"], return_tensors="pt").input_ids >>> # compute loss >>> loss = model(inputs=input_values, labels=decoder_input_ids).loss >>> # backprop loss >>> loss.backward() # doctest: +IGNORE_RESULT ```N)