weights associated with the model, only the configuration. Check out the [`~PreTrainedModel.from_pretrained`] method to load the model weights. is_sparse (`bool`): Whether the MLP layer is a `Sparse` layer (contains a Mixture of Experts) or not Fr: