es of input sequence tokens in the vocabulary.
        attention_mask:
            Mask to avoid performing attention on padding token indices.
        idf:
            Whether to apply inverse document frequency (IDF) normalization.
        batch_size:
            The batch size used for model processing.
        num_workers:
            The number of workers to use for the dataloader.

    Return:
        An instance of ``torch.utils.data.DataLoader`` used for iterating over examples.

    )