he number of points-per-side sampled in layer n is scaled down by crop_n_points_downscale_factor**n. device (`torch.device`, *optional*, defaults to None): Device to use for the computation. If None, cpu will be used. input_data_format (`str` or `ChannelDimension`, *optional*): The channel dimension format of the input image. If not provided, it will be inferred. return_tensors (`str`, *optional*, defaults to `pt`): If `pt`, returns `torch.Tensor`. If `tf`, returns `tf.Tensor`. rn