ep() and scheduler step()
3. Dump state_dict of the current sparsifier

on_train_end()
squash mask c