are not taken into account for computing the loss. end_positions (`torch.LongTensor` of shape `(batch_size,)`, *optional*): Labels for position (index) of the end of the labelled span for computing the token classification loss. Positions are clamped to the length of the sequence (`sequence_length`). Position outside of the sequence are not taken into account for computing the loss. is_impossible (`torch.LongTensor` of shape `(batch_size,)`, *optional*): Labels whether a question has an answer or no answer (SQuAD 2.0) cls_index (`torch.LongTensor` of shape `(batch_size,)`, *optional*): Labels for position (index) of the classification token to use as input for computing plausibility of the answer. p_mask (`torch.FloatTensor` of shape `(batch_size, sequence_length)`, *optional*): Optional mask of tokens which can't be in answers (e.g. [CLS], [PAD], ...). 1.0 means token should be masked. 0.0 mean token is not masked. Returns: Example: ```python >>> from transformers import XLMTokenizer, XLMForQuestionAnswering >>> import torch >>> tokenizer = XLMTokenizer.from_pretrained("xlm-mlm-en-2048") >>> model = XLMForQuestionAnswering.from_pretrained("xlm-mlm-en-2048") >>> input_ids = torch.tensor(tokenizer.encode("Hello, my dog is cute", add_special_tokens=True)).unsqueeze( ... 0 ... ) # Batch size 1 >>> start_positions = torch.tensor([1]) >>> end_positions = torch.tensor([3]) >>> outputs = model(input_ids, start_positions=start_positions, end_positions=end_positions) >>> loss = outputs.loss ```Nr