0.02):
    The standard deviation of the truncated_normal_initializer for initializing all weight matrices.
layer_norm_eps (`float`, *optional*, defaults to 1e-12):
    The epsilon used by the layer normalization layers.
bypass_transformer (`bool`, *optional*, defaults to `False`):
    Whether or not the model should bypass the transformer for the visual embeddings. If set to `True`, the
    model directly concatenates the visual embeddings from [`VisualBertEmbeddings`] with the text output from
    the transformer, and then passes the result to a self-attention layer.
special_visual_initialize (`bool`, *optional*, defaults to `True`):
    Whether or not the visual token type and position type embedding weights should be initialized the same as
    the textual token type and position type embeddings. When set to `True`, the weights of the textual token
    type and position type embeddings are copied to the respective visual embedding layers.

Example:

```python
>>> from transformers import VisualBertConfig, VisualBertModel

>>> # Initializing a VisualBERT visualbert-vqa-coco-pre style configuration
>>> configuration = VisualBertConfig.from_pretrained("uclanlp/visualbert-vqa-coco-pre")

>>> # Initializing a model (with random weights) from the visualbert-vqa-coco-pre style configuration
>>> model = VisualBertModel(configuration)

>>> # Accessing the model configuration
>>> configuration = model.config
```
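
Illustrative sketch (not the library implementation): the copy described for `special_visual_initialize` can be pictured with a small, dependency-free toy. The helper names `init_embeddings` and `build_toy_embeddings`, the table sizes, and the use of a plain (untruncated) Gaussian are all assumptions made for illustration. The real model stores these tables as embedding layers and uses its own initializer.

```python
import copy
import random


def init_embeddings(num_rows, dim, std=0.02, seed=0):
    """Hypothetical helper: a toy embedding table with rows drawn from N(0, std).

    A plain Gaussian stands in for the truncated normal initializer here.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(num_rows)]


def build_toy_embeddings(special_visual_initialize=True,
                         type_vocab_size=2, max_positions=4, dim=3):
    """Hypothetical helper: build textual and visual tables as nested lists."""
    emb = {
        "token_type": init_embeddings(type_vocab_size, dim, seed=1),
        "position": init_embeddings(max_positions, dim, seed=2),
        # The visual tables start from their own independent random values.
        "visual_token_type": init_embeddings(type_vocab_size, dim, seed=3),
        "visual_position": init_embeddings(max_positions, dim, seed=4),
    }
    if special_visual_initialize:
        # Copy the textual weights into the matching visual tables.
        # deepcopy keeps the tables separate, so later updates do not alias.
        emb["visual_token_type"] = copy.deepcopy(emb["token_type"])
        emb["visual_position"] = copy.deepcopy(emb["position"])
    return emb


copied = build_toy_embeddings(special_visual_initialize=True)
independent = build_toy_embeddings(special_visual_initialize=False)

# With the flag on, visual tables start equal to the textual ones;
# with it off, they keep their own random initial values.
print(copied["visual_position"] == copied["position"])
print(independent["visual_position"] == independent["position"])
```

In this toy, the flag only affects the starting values: the visual and textual tables remain separate objects, so they are free to diverge afterwards.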