not in the vocabulary cannot be converted to an ID and is set to be this token instead. pad_token (`str`, *optional*, defaults to `""`): The token used for padding, for example when batching sequences of different lengths. mask_token (`str`, *optional*, defaults to `""`): The token used for masking values. This is the token used when training this model with masked language modeling. This is the token which the model will try to predict. additional_special_tokens (`List[str]`, *optional*, defaults to `["~~NOTUSED", "~~NOTUSED"]`): Additional special tokens used by the tokenizer. Z input_idsZattention_maskNú