d using [`~FlaxPreTrainedModel.save_pretrained`], e.g., `./my_model_directory/`. decoder_pretrained_model_name_or_path (`Union[str, os.PathLike]`, *optional*, defaults to `None`): Information necessary to initiate the decoder. Can be either: - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like `bert-base-uncased`, or namespaced under a user or organization name, like `dbmdz/bert-base-german-cased`. - A path to a *directory* containing model weights saved using [`~FlaxPreTrainedModel.save_pretrained`], e.g., `./my_model_directory/`. model_args (remaining positional arguments, *optional*): All remaning positional arguments will be passed to the underlying model's `__init__` method. kwargs (remaining dictionary of keyword arguments, *optional*): Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). - To update the encoder configuration, use the prefix *encoder_* for each configuration parameter. - To update the decoder configuration, use the prefix *decoder_* for each configuration parameter. - To update the parent model configuration, do not use a prefix for each configuration parameter. Behaves differently depending on whether a `config` is provided or automatically loaded. Example: ```python >>> from transformers import FlaxVisionEncoderDecoderModel >>> # initialize a vit-gpt2 from a pretrained ViT and a pretrained GPT2 model. Note that the cross-attention layers will be randomly initialized >>> model = FlaxVisionEncoderDecoderModel.from_encoder_decoder_pretrained( ... "google/vit-base-patch16-224-in21k", "gpt2" ... ) >>> # saving model after fine-tuning >>> model.save_pretrained("./vit-gpt2") >>> # load fine-tuned model >>> model = FlaxVisionEncoderDecoderModel.from_pretrained("./vit-gpt2") ```c