e mel spectrogram and prepares it for the mode based on the `truncation` and `padding` arguments. Four different path are possible: - `truncation="fusion"` and the length of the waveform is greater than the max length: the mel spectrogram will be computed on the entire audio. 3 random crops and a dowsampled version of the full mel spectrogram are then stacked together. They will later be used for `feature_fusion`. - `truncation="rand_trunc"` and the length of the waveform is smaller than the max length: the audio is padded based on `padding`. - `truncation="fusion"` and the length of the waveform is smaller than the max length: the audio is padded based on `padding`, and is repeated `4` times. - `truncation="rand_trunc"` and the length of the waveform is greater than the max length: the mel spectrogram will be computed on a random crop of the waveform. r