` is only effective if group beam search is enabled. The penalty is subtracted from a beam's score when
            it generates a token that a beam in another group has already chosen at the same time step. A
            higher `diversity_penalty` enforces greater diversity among the groups, making it less likely that
            beams in different groups choose the same token. Conversely, a lower penalty lets beams choose
            similar tokens more freely. Adjusting this value helps balance diversity against the likelihood of
            the output.
        num_beams (`int`):
            Number of beams used for group beam search. Beam search maintains several hypotheses (beams) at
            each step, expanding each one and keeping the top-scoring sequences. A higher `num_beams` explores
            more candidate sequences, which can increase the chance of finding a high-quality output but also
            increases computational cost.
        num_beam_groups (`int`):
            Number of groups to divide `num_beams` into in order to ensure diversity among different groups of
            beams. Beam search runs within each group, and the diversity penalty discourages a group from
            selecting tokens that other groups have already chosen at the same time step, so that different
            groups explore different paths. For instance, if `num_beams` is 6 and `num_beam_groups` is 2, there
            will be 2 groups of 3 beams each. Choose `num_beam_groups` based on the desired level of output
            diversity and the total number of beams. See [this paper](https://arxiv.org/pdf/1610.02424.pdf)
            for more details.
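
    The effect of the penalty on a single decoding step can be illustrated with a small, dependency-free
    sketch. This is not the library's implementation: `apply_hamming_diversity_penalty` is a hypothetical
    helper, the scores and token ids are made-up toy values, and the sketch assumes the penalty scales with
    the number of beams in other groups that picked a token at the current step.

    ```python
    from collections import Counter


    def apply_hamming_diversity_penalty(scores, previous_group_tokens, diversity_penalty):
        """Lower each token's score by `diversity_penalty` times the number of
        beams in other groups that already picked that token at this step."""
        counts = Counter(previous_group_tokens)
        return [
            [score - diversity_penalty * counts[token_id] for token_id, score in enumerate(beam_scores)]
            for beam_scores in scores
        ]


    # 6 beams split into 2 groups gives 3 beams per group.
    num_beams, num_beam_groups = 6, 2
    beams_per_group = num_beams // num_beam_groups

    # Hypothetical log-probabilities over a 4-token vocabulary for the beams of the second group.
    group_scores = [[-1.0, -2.0, -3.0, -4.0] for _ in range(beams_per_group)]
    # Hypothetical tokens picked by the first group's beams at this step.
    previous_group_tokens = [0, 0, 2]

    penalized = apply_hamming_diversity_penalty(group_scores, previous_group_tokens, diversity_penalty=1.5)
    print(penalized[0])  # [-4.0, -2.0, -4.5, -4.0]
    ```

    Token 0 was the best candidate before the penalty; after it, token 1 scores highest, so the second group
    is steered away from the first group's choice.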
    Examples:

    ```python
    >>> from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
    >>> import torch

    >>> # Initialize the model and tokenizer
    >>> tokenizer = AutoTokenizer.from_pretrained("t5-base")
    >>> model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")

    >>> # A long text about the solar system
    >>> text = "The Solar System is a gravitationally bound system comprising the Sun and the objects that orbit it, either directly or indirectly. Of the objects that orbit the Sun directly, the largest are the eight planets, with the remainder being smaller objects, such as the five dwarf planets and small Solar System bodies. The Solar System formed 4.6 billion years ago from the gravitational collapse of a giant interstellar molecular cloud."
    >>> inputs = tokenizer("summarize: " + text, return_tensors="pt")

    >>> # Generate diverse summary
    >>> outputs_diverse = model.generate(
    ...     **inputs,
    ...     num_beam_groups=2,
    ...     diversity_penalty=10.0,
    ...     max_length=100,
    ...     num_beams=4,
    ...     num_return_sequences=2,
    ... )
    >>> summaries_diverse = tokenizer.batch_decode(outputs_diverse, skip_special_tokens=True)

    >>> # Generate non-diverse summary
    >>> outputs_non_diverse = model.generate(
    ...     **inputs,
    ...     max_length=100,
    ...     num_beams=4,
    ...     num_return_sequences=2,
    ... )
    >>> summary_non_diverse = tokenizer.batch_decode(outputs_non_diverse, skip_special_tokens=True)

    >>> # With `diversity_penalty`, the resulting beams are much more diverse
    >>> print(summary_non_diverse)
    ['the solar system formed 4.6 billion years ago from the collapse of a giant interstellar molecular cloud. of the objects that orbit the Sun directly, the largest are the eight planets.', 'the Solar System formed 4.6 billion years ago from the collapse of a giant interstellar molecular cloud. of the objects that orbit the Sun directly, the largest are the eight planets.']

    >>> print(summaries_diverse)
    ['the solar system formed 4.6 billion years ago from the collapse of a giant interstellar molecular cloud. of the objects that orbit the Sun directly, the largest are the eight planets.', 'the solar system formed 4.6 billion years ago from the collapse of a giant interstellar molecular cloud. of the objects that orbit the Sun directly, the largest are the eight planets. the rest of the objects are smaller objects, such as the five dwarf planets and small solar system bodies.']
    ```
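
    The difference between the two printed lists in the example above can be made concrete with a rough
    word-overlap measure. The sketch below is only an illustration: `jaccard_distance` is a hypothetical
    helper that is not part of `transformers`, and word-level Jaccard distance is just one possible way to
    quantify diversity. The strings are copied from the printed outputs.

    ```python
    def jaccard_distance(a: str, b: str) -> float:
        """Return 1 - |A & B| / |A | B| over lowercased, whitespace-split words."""
        words_a, words_b = set(a.lower().split()), set(b.lower().split())
        union = words_a | words_b
        if not union:
            return 0.0
        return 1.0 - len(words_a & words_b) / len(union)


    base = (
        "the solar system formed 4.6 billion years ago from the collapse of a giant interstellar "
        "molecular cloud. of the objects that orbit the Sun directly, the largest are the eight planets."
    )
    extra = (
        " the rest of the objects are smaller objects, such as the five dwarf planets and small "
        "solar system bodies."
    )

    # The two non-diverse summaries differ only in the capitalization of "Solar System".
    non_diverse = [base, base.replace("solar system", "Solar System")]
    # The second diverse summary extends the first with an additional sentence.
    diverse = [base, base + extra]

    non_diverse_distance = jaccard_distance(*non_diverse)
    diverse_distance = jaccard_distance(*diverse)
    print(non_diverse_distance)  # 0.0
    print(diverse_distance > non_diverse_distance)  # True
    ```

    Under this measure the non-diverse pair is identical up to capitalization, while the diverse pair differs
    only by the extra sentence in the second summary.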