num_train_epochs (`float`, *optional*, defaults to 3.0):
    Total number of training epochs to perform (if not an integer, the decimal part is the fraction of the
    last epoch to perform before stopping training).
max_steps (`int`, *optional*, defaults to -1):
    If set to a positive number, the total number of training steps to perform. Overrides `num_train_epochs`.
    When using a finite iterable dataset, training may stop before reaching the set number of steps if all
    data is exhausted.
gradient_accumulation_steps (`int`, *optional*, defaults to 1):
    Number of update steps to accumulate gradients for before performing a backward/update pass. When using
    gradient accumulation, one step is counted as one step with a backward pass; therefore, logging,
    evaluation, and saving will be conducted every `gradient_accumulation_steps * xxx_step` training examples
    (see the accumulation sketch after the example below).
seed (`int`, *optional*, defaults to 42):
    Random seed that will be set at the beginning of training. To ensure reproducibility across runs, use the
    [`~Trainer.model_init`] function to instantiate the model if it has some randomly initialized parameters
    (see the `model_init` sketch below).
gradient_checkpointing (`bool`, *optional*, defaults to `False`):
    If `True`, use gradient checkpointing to save memory at the expense of a slower backward pass.

Example:

```py
>>> from transformers import TrainingArguments

>>> args = TrainingArguments("working_dir")
>>> args = args.set_training(learning_rate=1e-4, batch_size=32)
>>> args.learning_rate
1e-4
```
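To make the step accounting for `gradient_accumulation_steps` concrete, here is a minimal sketch of the accumulation pattern in plain PyTorch, not the `Trainer` internals; the toy model, optimizer, data, the value `gradient_accumulation_steps = 2`, and the `global_step` counter are all illustrative assumptions:

```py
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy setup (illustrative only, not Trainer internals).
model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loader = DataLoader(TensorDataset(torch.randn(32, 4), torch.randn(32, 1)), batch_size=4)

gradient_accumulation_steps = 2
global_step = 0
optimizer.zero_grad()
for i, (x, y) in enumerate(loader):
    # Scale the loss so the accumulated gradients average over micro-batches.
    loss = nn.functional.mse_loss(model(x), y) / gradient_accumulation_steps
    loss.backward()  # gradients accumulate in .grad across micro-batches
    if (i + 1) % gradient_accumulation_steps == 0:
        optimizer.step()       # one optimizer update = one "training step"
        optimizer.zero_grad()
        global_step += 1       # logging/eval/save intervals count these steps
```

The key point the docstring makes is visible here: only the outer `global_step` counts as a step, so each logged step has seen `gradient_accumulation_steps` micro-batches of examples.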
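And a minimal sketch of the `model_init` pattern mentioned under `seed`, so that randomly initialized weights are re-seeded identically on every run; the `"bert-base-uncased"` checkpoint is a placeholder, and the train/eval datasets are omitted for brevity:

```py
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

def model_init():
    # Called by the Trainer after the seed has been set, so any randomly
    # initialized weights (e.g. a fresh classification head) are reproducible.
    return AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

args = TrainingArguments("working_dir", seed=42)
trainer = Trainer(model_init=model_init, args=args)  # datasets omitted in this sketch
```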