l`, *optional*, defaults to `True`):
    Whether or not the model should return the last key/values attentions (not used by all models). Only
    relevant if `config.is_decoder=True`.
emb_layer_norm_before (`bool`, *optional*):
    Whether to apply layer normalization after embeddings but before the main stem of the network.
token_dropout (`bool`, *optional*, defaults to `False`):
    When this is enabled, masked tokens are treated as if they had been dropped out by input dropout.

Examples:

```python
>>> from transformers import EsmModel, EsmConfig

>>> # Initializing an ESM facebook/esm-1b style configuration
>>> # (vocab_size is passed explicitly because the embedding layer needs a concrete size)
>>> configuration = EsmConfig(vocab_size=33)

>>> # Initializing a model from the configuration
>>> model = EsmModel(configuration)

>>> # Accessing the model configuration
>>> configuration = model.config
```
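
The `token_dropout` description is terse, so here is a minimal pure-Python sketch of what treating masked tokens as dropped out can look like. It assumes the scheme used by the reference ESM embedding layer: embeddings at mask positions are zeroed, and the remaining embeddings are rescaled by `(1 - mask_ratio_train) / (1 - mask_ratio_observed)`. The helper name `apply_token_dropout`, the constant `MASK_RATIO_TRAIN`, and the token ids are hypothetical and are not part of the `transformers` API.

```python
# Illustrative sketch only: `apply_token_dropout` is a hypothetical helper,
# not a function exposed by transformers.

# Assumed fraction of tokens replaced by the mask token during pretraining
# (15% of positions selected, 80% of those replaced by the mask token).
MASK_RATIO_TRAIN = 0.15 * 0.8


def apply_token_dropout(embeddings, input_ids, mask_token_id,
                        mask_ratio_train=MASK_RATIO_TRAIN):
    """Zero the embeddings of masked tokens and rescale the others.

    embeddings: list of per-token vectors (lists of floats) for one sequence.
    input_ids:  list of token ids with the same length as `embeddings`.
    """
    if len(embeddings) != len(input_ids):
        raise ValueError("embeddings and input_ids must have the same length")
    if not input_ids:
        return []

    n_masked = sum(1 for tok in input_ids if tok == mask_token_id)
    mask_ratio_observed = n_masked / len(input_ids)

    # If every token is masked there is nothing left to rescale.
    if n_masked == len(input_ids):
        return [[0.0] * len(vec) for vec in embeddings]

    # Same idea as inverted dropout: compensate for the zeroed positions so
    # the expected magnitude matches what the model saw during training.
    scale = (1.0 - mask_ratio_train) / (1.0 - mask_ratio_observed)

    return [
        [0.0] * len(vec) if tok == mask_token_id else [x * scale for x in vec]
        for vec, tok in zip(embeddings, input_ids)
    ]


# Usage: one of four tokens is the (hypothetical) mask id 32.
ids = [5, 32, 7, 9]
emb = [[1.0, 2.0] for _ in ids]
dropped = apply_token_dropout(emb, ids, mask_token_id=32)
```

With no masked tokens in the input, the observed ratio is zero and every embedding is simply scaled by `1 - mask_ratio_train`, which mirrors how dropout-trained layers behave at inference time.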