o 1.0): If < 1.0, only keep the top tokens with cumulative probability >= top_p (nucleus filtering). Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751) min_tokens_to_keep (`int`, *optional*, defaults to 1): Minimumber of tokens we keep per batch example in the output. From: https://gist.github.com/thomwolf/1a5a29f6962089e871b94cbd09daf317 r