omputing attention (dot-product) and upcast attention dot-product/softmax to float() when training with mixed precision. Example: ```python >>> from transformers import DecisionTransformerConfig, DecisionTransformerModel >>> # Initializing a DecisionTransformer configuration >>> configuration = DecisionTransformerConfig() >>> # Initializing a model (with random weights) from the configuration >>> model = DecisionTransformerModel(configuration) >>> # Accessing the model configuration >>> configuration = model.config ```Z