proj_to_labels (`bool`, *optional*, defaults to `True`):
    Argument used when doing sequence summary, used in the model [`OpenAIGPTDoubleHeadsModel`].

    Whether the projection outputs should have `config.num_labels` or `config.hidden_size` classes.
summary_first_dropout (`float`, *optional*, defaults to 0.1):
    Argument used when doing sequence summary, used in the model [`OpenAIGPTDoubleHeadsModel`].

    The dropout ratio to be used after the projection and activation.
use_cache (`bool`, *optional*, defaults to `True`):
    Whether or not the model should return the last key/values attentions (not used by all models).

Examples:

```python
>>> from transformers import OpenAIGPTConfig, OpenAIGPTModel

>>> # Initializing a GPT configuration
>>> configuration = OpenAIGPTConfig()

>>> # Initializing a model (with random weights) from the configuration
>>> model = OpenAIGPTModel(configuration)

>>> # Accessing the model configuration
>>> configuration = model.config
```
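
As a rough illustration of the `proj_to_labels` description above, the sketch below shows in plain Python how such a flag could select the width of the summary projection output. The helper `summary_output_size` and the values passed to it are hypothetical and exist only for this illustration; they are not part of the library, whose actual logic may include further checks.

```python
def summary_output_size(proj_to_labels: bool, num_labels: int, hidden_size: int) -> int:
    """Hypothetical helper: number of classes produced by the summary projection.

    Follows the parameter description: project to `num_labels` classes when
    `proj_to_labels` is True, otherwise keep `hidden_size` classes.
    """
    if proj_to_labels:
        return num_labels
    return hidden_size


# Illustrative values only (assumed, not taken from any real configuration).
size_with_labels = summary_output_size(proj_to_labels=True, num_labels=1, hidden_size=768)
size_with_hidden = summary_output_size(proj_to_labels=False, num_labels=1, hidden_size=768)
```

With the flag enabled the projection is sized by the number of labels; with it disabled the projection keeps the hidden size, so any downstream layer would need to handle the mapping to labels itself.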