    shape `(batch_size, sequence_length)`.
use_cache (`bool`, *optional*):
    If set to `True`, `past_key_values` key value states are returned and can be used to speed up decoding (see
    `past_key_values`).

Returns:
    `transformers.modeling_outputs.CausalLMOutputWithCrossAttentions` or `tuple(torch.FloatTensor)`

Example:

```python
>>> from transformers import AutoTokenizer, XmodForCausalLM, AutoConfig
>>> import torch

>>> tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
>>> config = AutoConfig.from_pretrained("facebook/xmod-base")
>>> config.is_decoder = True
>>> model = XmodForCausalLM.from_pretrained("facebook/xmod-base", config=config)
>>> model.set_default_language("en_XX")

>>> inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
>>> outputs = model(**inputs)

>>> prediction_logits = outputs.logits
```
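
The sketch below illustrates why reusing cached key value states, as `use_cache` allows, can speed up decoding. It is a toy single-head attention loop in plain Python, not the library's implementation: `project`, `attend`, `decode`, and the token vectors are hypothetical names and values chosen only for illustration. It shows that caching gives the same outputs while computing far fewer key/value projections.

```python
import math

projection_calls = 0  # counts how often a key/value pair is computed


def project(token):
    """Hypothetical stand-in for a layer's key and value projections."""
    global projection_calls
    projection_calls += 1
    key = [2.0 * x for x in token]
    value = [x + 1.0 for x in token]
    return key, value


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total for i in range(dim)]


def decode(tokens, use_cache):
    """Process tokens left to right; each token attends to itself and its prefix."""
    cache = []
    outputs = []
    for t, token in enumerate(tokens):
        if use_cache:
            cache.append(project(token))  # project only the newest token
            pairs = cache
        else:
            pairs = [project(tok) for tok in tokens[: t + 1]]  # redo the whole prefix
        keys = [k for k, _ in pairs]
        values = [v for _, v in pairs]
        outputs.append(attend(token, keys, values))  # the token doubles as its own query
    return outputs


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # made-up token vectors

projection_calls = 0
full = decode(tokens, use_cache=False)
calls_without_cache = projection_calls

projection_calls = 0
cached = decode(tokens, use_cache=True)
calls_with_cache = projection_calls
```

In this toy setting the projection count grows quadratically with sequence length without the cache and linearly with it, which is the saving the `use_cache` flag is meant to expose during step-by-step decoding.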