Trainer does not switch to train mode after validation step #20177

@ClemensSchwarke

Bug description

After the validation step, the model is not set back to train mode because the following hook

def on_validation_model_train(self) -> None:
    """Called when the validation loop ends.

    The validation loop by default restores the `training` mode of the LightningModule to what it was before
    starting validation. Override this hook to change the behavior. See also
    :meth:`~pytorch_lightning.core.hooks.ModelHooks.on_validation_model_eval`.

    """
    # The loop won't call this hook unless it is overridden. The line below is here in case the user calls super().
    self.trainer.model.train()

is not called unless it is overridden, as can be seen here:

def _on_evaluation_model_train(self) -> None:
    """Undoes the eval mode."""
    trainer = self.trainer
    hook_name = "on_test_model_train" if trainer.testing else "on_validation_model_train"
    if is_overridden(hook_name, trainer.lightning_module):
        call._call_lightning_module_hook(trainer, hook_name)
    else:
        self._module_mode.restore(trainer.lightning_module)

I don't see the point of this behavior, and I think it is very likely to cause bugs when people implement their own mode logic (for example by overriding `train()`) and expect it to be called. I would be happy to understand the reasoning (:
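To make the two branches concrete, here is a minimal pure-Python sketch of the dispatch quoted above. It does not use Lightning or PyTorch: `FakeModule`, `OverridingModule`, `run_validation`, and the simplified `is_overridden` are hypothetical stand-ins, and the assumption that the default branch restores the saved `training` flag by plain assignment (without going through `train()`) is my reading of the behavior, not something confirmed from the `_module_mode.restore` source.

```python
class FakeModule:
    """Hypothetical stand-in for a module; it only models the `training` flag."""

    def __init__(self):
        self.training = True
        self.train_calls = []  # records every call to train()

    def train(self, mode=True):
        self.train_calls.append(mode)
        self.training = mode
        return self

    def eval(self):
        # As in torch.nn.Module, eval() is routed through train(False).
        return self.train(False)

    def on_validation_model_train(self):
        # Default hook body, mirroring the hook quoted in this issue.
        self.train()


class OverridingModule(FakeModule):
    """Overrides the hook, so the dispatch below takes the first branch."""

    def on_validation_model_train(self):
        super().on_validation_model_train()


def is_overridden(hook_name, module):
    """Simplified stand-in: True if the module's class redefines the hook."""
    return getattr(type(module), hook_name) is not getattr(FakeModule, hook_name)


def run_validation(module):
    saved_mode = module.training  # assumed: the flag is captured before validation
    module.eval()                 # validation itself runs in eval mode
    if is_overridden("on_validation_model_train", module):
        module.on_validation_model_train()  # goes through train()
    else:
        module.training = saved_mode  # assumed: flag restored, train() never called
    return module


default = run_validation(FakeModule())
custom = run_validation(OverridingModule())
print(default.training, default.train_calls)  # True [False]
print(custom.training, custom.train_calls)    # True [False, True]
```

Under these assumptions both modules end up with `training == True`, but only the one that overrides the hook sees a `train(True)` call. That would match the log below, where a custom `train()` prints only "set to evaluation mode".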

What version are you seeing the problem on?

v2.4

How to reproduce the bug

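# Add this override to the LightningModule to log every mode switch.
def train(self, mode=True):
    super().train(mode)
    if mode:
        print("set to training mode")
    else:
        print("set to evaluation mode")
    return self  # keep the nn.Module.train() contract of returning self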

Error messages and logs

Epoch 0: 100%
set to evaluation mode
Epoch 1:  27%
