ull"` and `"block_sparse"`. use_bias (`bool`, *optional*, defaults to `True`) Whether to use bias in query, key, value. rescale_embeddings (`bool`, *optional*, defaults to `False`) Whether to rescale embeddings with (hidden_size ** 0.5). block_size (`int`, *optional*, defaults to 64) Size of each block. Useful only when `attention_type == "block_sparse"`. num_random_blocks (`int`, *optional*, defaults to 3) Each query is going to attend these many number of random blocks. Useful only when `attention_type == "block_sparse"`. classifier_dropout (`float`, *optional*): The dropout ratio for the classification head. Example: ```python >>> from transformers import BigBirdConfig, BigBirdModel >>> # Initializing a BigBird google/bigbird-roberta-base style configuration >>> configuration = BigBirdConfig() >>> # Initializing a model (with random weights) from the google/bigbird-roberta-base style configuration >>> model = BigBirdModel(configuration) >>> # Accessing the model configuration >>> configuration = model.config ```Zbig_birdé¶Ä