oat`, *optional*, defaults to 4.0): Ratio of mlp hidden dim to embedding dim. use_abs_pos (`bool`, *optional*, defaults to True): Whether to use absolute position embedding. use_rel_pos (`bool`, *optional*, defaults to True): Whether to use relative position embedding. window_size (`int`, *optional*, defaults to 14): Window size for relative position. global_attn_indexes (`List[int]`, *optional*, defaults to `[2, 5, 8, 11]`): The indexes of the global attention layers. num_pos_feats (`int`, *optional*, defaults to 128): The dimensionality of the position embedding. mlp_dim (`int`, *optional*, defaults to None): The dimensionality of the MLP layer in the Transformer encoder. If `None`, defaults to `mlp_ratio * hidden_size`. i