`float`, *optional*, defaults to 0.02): The standard deviation of the truncated_normal_initializer for initializing all weight matrices. layer_norm_eps (`float`, *optional*, defaults to 1e-12): The epsilon used by the layer normalization layers. attention_window (`int` or `List[int]`, *optional*, defaults to 512): Size of an attention window around each token. If an `int`, use the same size for all layers. To specify a different window size for each layer, use a `List[int]` where `len(attention_window) == num_hidden_layers`. Example: ```python >>> from transformers import LongformerConfig, LongformerModel >>> # Initializing a Longformer configuration >>> configuration = LongformerConfig() >>> # Initializing a model from the configuration >>> model = LongformerModel(configuration) >>> # Accessing the model configuration >>> configuration = model.config ```Z longformeré