    is needed for :mod:`torchvision.transforms.v2`.

Segmentation datasets

    Segmentation datasets, e.g. :class:`~torchvision.datasets.VOCSegmentation`, return a two-tuple of
    :class:`PIL.Image.Image` objects. This wrapper leaves the image as is (first item), while wrapping the
    segmentation mask into a :class:`~torchvision.tv_tensors.Mask` (second item).

Video classification datasets

    Video classification datasets, e.g. :class:`~torchvision.datasets.Kinetics`, return a three-tuple containing a
    :class:`torch.Tensor` each for the video and the audio, and an :class:`int` as the label. This wrapper wraps the
    video into a :class:`~torchvision.tv_tensors.Video` while leaving the other items as is.

    .. note::

        Only datasets constructed with ``output_format="TCHW"`` are supported, since the alternative
        ``output_format="THWC"`` is not supported by :mod:`torchvision.transforms.v2`.

Args:
    dataset: The dataset instance to wrap for compatibility with transforms v2.
    target_keys: Target keys to return in case the target is a dictionary. If ``None`` (default), the selected
        keys are specific to the dataset. If ``"all"``, returns the full target. Can also be a collection of
        strings for fine-grained access. Currently only supported for
        :class:`~torchvision.datasets.CocoDetection`, :class:`~torchvision.datasets.VOCDetection`,
        :class:`~torchvision.datasets.Kitti`, and :class:`~torchvision.datasets.WIDERFace`. See above for details.
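
The three ``target_keys`` modes described above can be illustrated with a small, library-free sketch. It is not torchvision's implementation: the helper ``select_target_keys``, the ``EXAMPLE_DEFAULT_KEYS`` set, and the sample target are hypothetical names and values chosen for illustration, and raising on unknown keys is an assumption of this sketch.

```python
from typing import Any, Collection, Dict, Optional, Union


def select_target_keys(
    target: Dict[str, Any],
    target_keys: Optional[Union[str, Collection[str]]],
    default_keys: Collection[str],
) -> Dict[str, Any]:
    """Hypothetical helper mirroring the documented ``target_keys`` options."""
    if target_keys is None:
        # None: fall back to the dataset-specific default selection.
        wanted = set(default_keys)
    elif target_keys == "all":
        # "all": return the full target dictionary (as a shallow copy).
        return dict(target)
    elif isinstance(target_keys, str):
        # Assumption of this sketch: any other bare string is rejected.
        raise ValueError(f"Unsupported target_keys string: {target_keys!r}")
    else:
        # Collection of strings: fine-grained selection.
        wanted = set(target_keys)
        unknown = wanted - set(target)
        if unknown:
            # Assumption of this sketch: unknown keys are an error.
            raise ValueError(f"Unknown target keys: {sorted(unknown)}")
    return {key: value for key, value in target.items() if key in wanted}


# Made-up detection-style target and made-up defaults, for illustration only.
example_target = {"boxes": [[10, 20, 30, 40]], "labels": [3], "image_id": 7}
EXAMPLE_DEFAULT_KEYS = {"boxes", "labels"}

default_selection = select_target_keys(example_target, None, EXAMPLE_DEFAULT_KEYS)
full_selection = select_target_keys(example_target, "all", EXAMPLE_DEFAULT_KEYS)
labels_only = select_target_keys(example_target, ["labels"], EXAMPLE_DEFAULT_KEYS)
```

With the real wrapper, the same choice is made by passing ``target_keys`` alongside the dataset instance; which keys are available and which are selected by default depends on the dataset.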