(potentially sharded) checkpoint inside a model, potentially sending weights to a given device as they are loaded.

Once loaded across devices, you still need to call [`dispatch_model`] on your model to make it able to run. To group
the checkpoint loading and dispatch in one single call, use [`load_checkpoint_and_dispatch`].

Args:
    model (`torch.nn.Module`):
        The model in which we want to load a checkpoint.
    checkpoint (`str` or `os.PathLike`):
        The checkpoint to load (a file or a folder). It can be:
        - a path to a file containing a whole model state dict
        - a path to a `.json` file containing the index to a sharded checkpoint
        - a path to a folder containing a unique `.index.json` file and the shards of a checkpoint
        - a path to a folder containing a unique `pytorch_model.bin` file
    device_map (`Dict[str, Union[int, str, torch.device]]`, *optional*):
        A map that specifies where each submodule should go. It doesn't need to be refined to each parameter/buffer
        name; once a given module name is inside, every submodule of it will be sent to the same device.
    offload_folder (`str` or `os.PathLike`, *optional*):
        If the `device_map` contains any value `"disk"`, the folder where we will offload weights.
    dtype (`str` or `torch.dtype`, *optional*):
        If provided, the weights will be converted to that type when loaded.
    offload_state_dict (`bool`, *optional*, defaults to `False`):
        If `True`, will temporarily offload the CPU state dict to the hard drive to avoid running out of CPU RAM if
        the CPU state dict plus the biggest shard does not fit in memory.
    offload_buffers (`bool`, *optional*, defaults to `False`):
        Whether or not to include the buffers in the weights offloaded to disk.
    keep_in_fp32_modules (`List[str]`, *optional*):
        A list of the modules that we keep in `torch.float32` dtype.
    offload_8bit_bnb (`bool`, *optional*):
        Whether or not to enable offload of 8-bit modules on CPU/disk.
r
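
Example:

The two rules described for `checkpoint` and `device_map` can be illustrated with a minimal, standard-library-only
sketch. It is not the library's implementation: `classify_checkpoint` and `resolve_device` are hypothetical helper
names introduced here, and the order in which the folder cases are checked is an assumption.

```python
import os
from typing import Dict, Union

Device = Union[int, str]


def classify_checkpoint(path: str) -> str:
    """Hypothetical helper: map a checkpoint path to 'state_dict' or 'index'."""
    if os.path.isfile(path):
        # A `.json` file is treated as the index of a sharded checkpoint,
        # any other file as a whole model state dict.
        return "index" if str(path).endswith(".json") else "state_dict"
    if os.path.isdir(path):
        names = os.listdir(path)
        index_files = [n for n in names if n.endswith(".index.json")]
        if len(index_files) == 1:
            return "index"
        if len(index_files) > 1:
            raise ValueError(f"{path} contains more than one `.index.json` file.")
        if "pytorch_model.bin" in names:
            return "state_dict"
    raise ValueError(f"{path} is not one of the supported checkpoint layouts.")


def resolve_device(param_name: str, device_map: Dict[str, Device]) -> Device:
    """Hypothetical helper: find the device of the closest listed parent module."""
    module_name = param_name
    # Walk up the dotted name: "a.b.weight" -> "a.b" -> "a" -> "" (whole model).
    while module_name not in device_map:
        if module_name == "":
            raise ValueError(f"{param_name} doesn't have any device set.")
        module_name = ".".join(module_name.split(".")[:-1])
    return device_map[module_name]


# Illustrative map: module names here are made up for the example.
device_map = {"encoder": 0, "decoder": "cpu", "lm_head": "disk"}
for name in ("encoder.layer.0.weight", "decoder.bias", "lm_head.weight"):
    print(name, "->", resolve_device(name, device_map))
```

Because the lookup walks up the dotted name, listing `"encoder"` once is enough to place every parameter and buffer
underneath it, and a single `""` key places the whole model on one device.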