
Introduce a mechanism to set the epoch on the sampler in LightningLite #14636

@awaelchli

Description


🚀 Feature

We need a mechanism to set the epoch on the distributed sampler via .set_epoch().

Motivation

To correctly handle shuffling with the DistributedSampler in DDP, the PyTorch user would normally call

sampler.set_epoch(epoch)

in their training loop. PL handles this automatically for the user, but in Lite the training loop is owned by the user, so the sampler epoch has to be communicated in a different way.
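For reference, the plain-PyTorch pattern looks roughly like this; a minimal sketch (num_replicas and rank are passed explicitly only so the snippet runs outside a launched DDP job):

import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

dataset = TensorDataset(torch.arange(100).float())
# In a real DDP run, DistributedSampler infers rank/world size from the process group.
sampler = DistributedSampler(dataset, num_replicas=2, rank=0, shuffle=True)
dataloader = DataLoader(dataset, sampler=sampler, batch_size=10)

for epoch in range(3):
    # Without this call, every epoch reuses the shuffle order of epoch 0.
    sampler.set_epoch(epoch)
    for batch in dataloader:
        ...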

Pitch

Provide a strategy-agnostic API to set the epoch. The idea is that you don't need to change your code or add boilerplate conditional logic to handle the sampler when you switch from DDP to single-device training or vice versa.

Before:

....
train_dataloader, val_dataloader = self.setup_dataloaders(train_dataloader, val_dataloader)

for epoch in range(num_epochs):

    # Beginning of a new epoch

    # Boilerplate code
    if isinstance(train_dataloader.sampler, DistributedSampler):
        train_dataloader.sampler.set_epoch(epoch)


    for idx, data in enumerate(train_dataloader):
        ...

After:

....
train_dataloader, val_dataloader = self.setup_dataloaders(train_dataloader, val_dataloader)

for epoch in range(num_epochs):

    # Beginning of a new epoch

    # No-op unless the dataloader uses a DistributedSampler (i.e. under DDP)
    self.set_epoch(epoch, train_dataloader)

    for idx, data in enumerate(train_dataloader):
        ...
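A possible implementation inside Lite would be small. The method name and placement are exactly what this issue proposes; the sketch below only illustrates the intended no-op behaviour and is not an existing API:

from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler

def set_epoch(self, epoch: int, dataloader: DataLoader) -> None:
    # Hypothetical helper on LightningLite (name and signature are part of this proposal).
    sampler = getattr(dataloader, "sampler", None)
    if isinstance(sampler, DistributedSampler):
        sampler.set_epoch(epoch)
    # Any other sampler, or no sampler at all: nothing to do, silently no-op.

Whether custom samplers that also expose a set_epoch() method should be forwarded to as well is an open design question.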

Alternatives

Don't introduce this; leave it to the user to handle the boilerplate.

Additional context

PyTorch docs for DistributedSampler.set_epoch
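For context on why the call matters: with shuffle=True, DistributedSampler derives its permutation deterministically from seed + epoch, roughly like this (simplified sketch of the PyTorch behaviour):

import torch

def _shuffled_indices(dataset_len: int, seed: int, epoch: int) -> list:
    # Simplified from DistributedSampler.__iter__ with shuffle=True: the permutation
    # is a deterministic function of (seed, epoch). If set_epoch() is never called,
    # every epoch re-uses the epoch-0 permutation on every rank.
    g = torch.Generator()
    g.manual_seed(seed + epoch)
    return torch.randperm(dataset_len, generator=g).tolist()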


If you enjoy Lightning, check out our other projects! ⚡

  • Metrics: Machine learning metrics for distributed, scalable PyTorch applications.

  • Lite: Enables pure PyTorch users to scale their existing code on any kind of device while retaining full control over their own loops and optimization logic.

  • Flash: The fastest way to get a Lightning baseline! A collection of tasks for fast prototyping, baselining, fine-tuning, and solving problems with deep learning.

  • Bolts: Pretrained SOTA Deep Learning models, callbacks, and more for research and production with PyTorch Lightning and PyTorch.

  • Lightning Transformers: Flexible interface for high-performance research using SOTA Transformers leveraging PyTorch Lightning, Transformers, and Hydra.

cc @Borda @carmocca @justusschock @awaelchli
