Fully Sharded Training clip_grad_norm_ #13339

@wangleiofficial

Description

πŸš€ Feature

FSDP does not support the `gradient_clip_val` setting in `Trainer`.
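Because FSDP shards parameters across ranks, the usual global `torch.nn.utils.clip_grad_norm_` would compute a norm over only the local shard; FSDP wrappers instead expose their own `clip_grad_norm_` method that reduces the norm across shards. Below is a minimal sketch (not Lightning's actual implementation; the helper and stand-in classes are hypothetical) of how a training strategy could dispatch to the sharding-aware method when it exists:

```python
# Hypothetical sketch of strategy-level gradient-clipping dispatch.
# If the wrapped module exposes its own clip_grad_norm_ (as FSDP does),
# prefer it, since it reduces the gradient norm across parameter shards;
# otherwise fall back to the global utility, which is fine for
# non-sharded models.

def clip_gradients(module, max_norm: float) -> str:
    """Return which clipping path was taken (for illustration only)."""
    if hasattr(module, "clip_grad_norm_"):
        # FSDP-style: the wrapper computes a sharding-aware global norm
        module.clip_grad_norm_(max_norm)
        return "module.clip_grad_norm_"
    # plain model: caller would invoke torch.nn.utils.clip_grad_norm_
    return "torch.nn.utils.clip_grad_norm_"

class FSDPLike:
    """Stand-in for an FSDP-wrapped module that owns gradient clipping."""
    def __init__(self):
        self.clipped_with = None
    def clip_grad_norm_(self, max_norm, norm_type=2.0):
        self.clipped_with = (max_norm, norm_type)

class PlainModule:
    """Stand-in for an ordinary nn.Module without its own clipping."""
```

With this kind of dispatch, `Trainer(gradient_clip_val=...)` could route to the FSDP wrapper's method instead of silently clipping per-shard norms.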

Motivation

Pitch

Alternatives

Additional context


If you enjoy Lightning, check out our other projects! ⚑

  • Metrics: Machine learning metrics for distributed, scalable PyTorch applications.

  • Lite: enables pure PyTorch users to scale their existing code on any kind of device while retaining full control over their own loops and optimization logic.

  • Flash: The fastest way to get a Lightning baseline! A collection of tasks for fast prototyping, baselining, fine-tuning, and solving problems with deep learning.

  • Bolts: Pretrained SOTA Deep Learning models, callbacks, and more for research and production with PyTorch Lightning and PyTorch.

  • Lightning Transformers: Flexible interface for high-performance research using SOTA Transformers leveraging PyTorch Lightning, Transformers, and Hydra.

cc @SeanNaren @awaelchli @rohitgr7 @akihironitta
