rope_theta (`float`, *optional*, defaults to 10000.0):
    The base period of the RoPE embeddings.
rope_scaling (`Dict`, *optional*):
    Dictionary containing the scaling configuration for the RoPE embeddings. Currently supports two scaling
    strategies: linear and dynamic. Their scaling factor must be a float greater than 1. The expected format is
    `{"type": strategy name, "factor": scaling factor}`. When using this flag, don't update
    `max_position_embeddings` to the expected new maximum. See the following thread for more information on how
    these scaling strategies behave:
    https://www.reddit.com/r/LocalLLaMA/comments/14mrgpr/dynamically_scaled_rope_further_increases/. This is an
    experimental feature, subject to breaking API changes in future versions.
bos_token_id (`int`, *optional*, defaults to 11):
    The id of the "beginning-of-sequence" token.
eos_token_id (`int`, *optional*, defaults to 11):
    The id of the "end-of-sequence" token.

Example:

```python
>>> from transformers import FalconModel, FalconConfig

>>> # Initializing a small (2-layer) Falcon configuration
>>> configuration = FalconConfig(num_hidden_layers=2)

>>> # Initializing a model from the small configuration
>>> model = FalconModel(configuration)

>>> # Accessing the model configuration
>>> configuration = model.config
```
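
The `rope_scaling` format documented above can be checked with a few lines of plain Python. The sketch below is illustrative only: `validate_rope_scaling` and `SUPPORTED_ROPE_SCALING_TYPES` are hypothetical names introduced here, not part of the `transformers` API, and the checks simply restate the rules in the argument description (exactly the two keys `type` and `factor`, a strategy of `linear` or `dynamic`, and a float factor greater than 1).

```python
from typing import Any, Dict, Optional

# Strategy names taken from the `rope_scaling` description above.
SUPPORTED_ROPE_SCALING_TYPES = ("linear", "dynamic")


def validate_rope_scaling(rope_scaling: Optional[Dict[str, Any]]) -> None:
    """Hypothetical helper: raise ValueError if `rope_scaling` breaks the documented format."""
    if rope_scaling is None:
        return  # the argument is optional, so None means "no scaling"

    # The documented format has exactly two keys: "type" and "factor".
    if not isinstance(rope_scaling, dict) or set(rope_scaling) != {"type", "factor"}:
        raise ValueError(
            f"`rope_scaling` must be a dict with keys 'type' and 'factor', got {rope_scaling!r}"
        )

    scaling_type = rope_scaling["type"]
    factor = rope_scaling["factor"]

    if scaling_type not in SUPPORTED_ROPE_SCALING_TYPES:
        raise ValueError(
            f"`rope_scaling['type']` must be one of {SUPPORTED_ROPE_SCALING_TYPES}, got {scaling_type!r}"
        )

    # The description asks for a float strictly greater than 1.
    if not isinstance(factor, float) or factor <= 1.0:
        raise ValueError(
            f"`rope_scaling['factor']` must be a float greater than 1, got {factor!r}"
        )


# Accepted: a documented strategy with a float factor above 1.
validate_rope_scaling({"type": "linear", "factor": 2.0})

# Rejected: a factor of exactly 1.0 is not greater than 1.
try:
    validate_rope_scaling({"type": "dynamic", "factor": 1.0})
except ValueError as err:
    print(f"rejected: {err}")
```

This sketch treats an integer factor such as `2` as invalid because the description asks for a float; a more lenient check could accept integers as well.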