2 changes: 0 additions & 2 deletions docs/source-pytorch/common/gradient_accumulation.rst
@@ -9,8 +9,6 @@ effective batch size is increased but there is no memory overhead.
step, the effective batch size on each device will remain ``N*K`` but right before the ``optimizer.step()``, the gradient sync will make the effective
batch size as ``P*N*K``. For DP, since the batch is split across devices, the final effective batch size will be ``N*K``.
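
To make the bookkeeping concrete, here is a small illustrative configuration; the values of ``N``, ``K``, and ``P`` below are example choices, not defaults:

.. code-block:: python

    from lightning.pytorch import Trainer

    N = 32  # per-device batch size
    K = 4   # batches to accumulate before each optimizer.step()
    P = 2   # number of devices participating in DDP

    trainer = Trainer(accumulate_grad_batches=K, devices=P, strategy="ddp")

    # per-device effective batch size:          N * K = 128
    # after DDP gradient sync, effective size:  P * N * K = 256
    # under DP the batch is split, so:          N * K = 128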

.. seealso:: :class:`~lightning.pytorch.trainer.trainer.Trainer`

.. testcode::

# DEFAULT (ie: no accumulated grads)
63 changes: 8 additions & 55 deletions docs/source-pytorch/common/trainer.rst
@@ -239,14 +239,6 @@ Example::
accumulate_grad_batches
^^^^^^^^^^^^^^^^^^^^^^^

.. raw:: html

<video width="50%" max-width="400px" controls
poster="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/pl_docs/trainer_flags/thumb/accumulate_grad_batches.jpg"
src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/pl_docs/trainer_flags/accumulate_grad_batches.mp4"></video>

|

Accumulates gradients over k batches before stepping the optimizer.

.. testcode::
@@ -316,23 +308,18 @@ Example::
callbacks
^^^^^^^^^

.. raw:: html

<video width="50%" max-width="400px" controls
poster="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/pl_docs/trainer_flags/thumb/callbacks.jpg"
src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/pl_docs/trainer_flags/callbacks.mp4"></video>

|

Add a list of :class:`~lightning.pytorch.callbacks.callback.Callback`. Callbacks run sequentially in the order defined here
This argument can be used to add a :class:`~lightning.pytorch.callbacks.callback.Callback` or a list of them.
Callbacks run sequentially in the order defined here
with the exception of :class:`~lightning.pytorch.callbacks.model_checkpoint.ModelCheckpoint` callbacks which run
after all others to ensure all states are saved to the checkpoints.

.. code-block:: python

# single callback
trainer = Trainer(callbacks=PrintCallback())

# a list of callbacks
callbacks = [PrintCallback()]
trainer = Trainer(callbacks=callbacks)
trainer = Trainer(callbacks=[PrintCallback()])
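
``PrintCallback`` above stands in for any user-defined callback; a minimal hypothetical definition might look like this:

.. code-block:: python

    from lightning.pytorch.callbacks import Callback


    class PrintCallback(Callback):
        """Hypothetical callback used in the snippet above."""

        def on_train_start(self, trainer, pl_module):
            print("Training is starting")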

Example::

@@ -389,7 +376,7 @@ Default path for logs and weights when no logger or
:class:`lightning.pytorch.callbacks.ModelCheckpoint` callback is passed. On
certain clusters you might want to separate where logs and checkpoints are
stored. If you don't, then use this argument for convenience. Paths can be local
paths or remote paths such as `s3://bucket/path` or 'hdfs://path/'. Credentials
paths or remote paths such as ``s3://bucket/path`` or ``hdfs://path/``. Credentials
will need to be set up to use remote filepaths.
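
For instance, a remote location can be passed directly; the bucket name below is just a placeholder, and the credentials are assumed to be configured in the environment:

.. code-block:: python

    # hypothetical bucket; credentials must already be set up
    trainer = Trainer(default_root_dir="s3://my-bucket/lightning-logs")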

.. testcode::
@@ -452,14 +439,6 @@ Number of devices to train on (``int``), which devices to train on (``list`` or
enable_checkpointing
^^^^^^^^^^^^^^^^^^^^

.. raw:: html

<video width="50%" max-width="400px" controls
poster="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/pl_docs/trainer_flags/thumb/checkpoint_callback.jpg"
src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/pl_docs/trainer_flags/checkpoint_callback.mp4"></video>

|

By default Lightning saves a checkpoint for you in your current working directory, with the state of your last training epoch.
Checkpoints capture the exact value of all parameters used by a model.
To disable automatic checkpointing, set this to ``False``.
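
For example:

.. code-block:: python

    # default: automatic checkpointing is enabled
    trainer = Trainer(enable_checkpointing=True)

    # turn off automatic checkpointing
    trainer = Trainer(enable_checkpointing=False)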
@@ -545,12 +524,10 @@ gradient_clip_val

Gradient clipping value

- 0 means don't clip.

.. testcode::

# default used by the Trainer
trainer = Trainer(gradient_clip_val=0.0)
trainer = Trainer(gradient_clip_val=None)
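
With the default ``None``, no clipping is applied. The clipping algorithm can be chosen via ``gradient_clip_algorithm``; the values below are illustrative, not defaults:

.. code-block:: python

    # clip gradients by their global 2-norm
    trainer = Trainer(gradient_clip_val=0.5, gradient_clip_algorithm="norm")

    # clip gradient components element-wise by value
    trainer = Trainer(gradient_clip_val=0.5, gradient_clip_algorithm="value")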

limit_train_batches
^^^^^^^^^^^^^^^^^^^
@@ -663,14 +640,6 @@ See Also:
logger
^^^^^^

.. raw:: html

<video width="50%" max-width="400px" controls
poster="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/pl_docs/trainer_flags/thumb/logger.jpg"
src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/pl_docs/trainer_flags/logger.mp4"></video>

|

:doc:`Logger <../visualize/loggers>` (or iterable collection of loggers) for experiment tracking. A ``True`` value uses the default ``TensorBoardLogger`` shown below. ``False`` will disable logging.

.. testcode::
@@ -869,14 +838,6 @@ Useful for quickly debugging or trying to overfit on purpose.
plugins
^^^^^^^

.. raw:: html

<video width="50%" max-width="400px" controls
poster="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/pl_docs/trainer_flags/thumb/cluster_environment.jpg"
src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/pl_docs/trainer_flags/cluster_environment.mp4"></video>

|

:ref:`Plugins` allow you to connect arbitrary backends, precision libraries, clusters, etc. For example:

- :ref:`Checkpoint IO <checkpointing_expert>`
@@ -907,14 +868,6 @@ To define your own behavior, subclass the relevant class and pass it in. Here's
precision
^^^^^^^^^

.. raw:: html

<video width="50%" max-width="400px" controls
poster="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/pl_docs/trainer_flags/thumb/precision.jpg"
src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/pl_docs/trainer_flags/precision.mp4"></video>

|

Lightning supports double (64), float (32), bfloat16 (bf16), and half (16) precision training.

Half precision, or mixed precision, is the combined use of 32-bit and 16-bit floating-point numbers to reduce the memory footprint during model training. This can result in improved performance, achieving speedups of 3x or more on modern GPUs.
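
A few illustrative settings follow; the exact set of accepted values depends on the Lightning release:

.. code-block:: python

    # default used by the Trainer: full 32-bit precision
    trainer = Trainer(precision=32)

    # 16-bit mixed precision (requires supported hardware)
    trainer = Trainer(precision=16)

    # bfloat16 precision
    trainer = Trainer(precision="bf16")

    # double precision
    trainer = Trainer(precision=64)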
29 changes: 4 additions & 25 deletions docs/source-pytorch/extensions/callbacks.rst
@@ -7,24 +7,14 @@
Callback
########

.. raw:: html

<video width="100%" max-width="400px" controls
poster="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/pl_docs/trainer_flags/thumb/callbacks.jpg"
src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/pl_docs/trainer_flags/callbacks.mp4"></video>

|

A callback is a self-contained program that can be reused across projects.
Callbacks allow you to add arbitrary self-contained programs to your training.
At specific points during the flow of execution (hooks), the Callback interface allows you to design programs that encapsulate a full set of functionality.
It de-couples functionality that does not need to be in the :doc:`lightning module <../common/lightning_module>` and can be shared across projects.

Lightning has a callback system to execute them when needed. Callbacks should capture NON-ESSENTIAL
logic that is NOT required for your :doc:`lightning module <../common/lightning_module>` to run.

Here's the flow of how the callback hooks are executed:

.. raw:: html

<video width="100%" max-width="400px" controls autoplay muted playsinline src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/pl_docs/pt_callbacks_mov.m4v"></video>
A complete list of Callback hooks can be found in :class:`~lightning.pytorch.callbacks.callback.Callback`.

An overall Lightning system should have:

@@ -54,19 +44,8 @@ Example:
We successfully extended functionality without polluting our super clean
:doc:`lightning module <../common/lightning_module>` research code.

-----------

********
Examples
********
You can do pretty much anything with callbacks.

- `Add a MLP to fine-tune self-supervised networks <https://lightning-bolts.readthedocs.io/en/latest/callbacks/self_supervised.html#sslonlineevaluator>`_.
- `Find how to modify an image input to trick the classification result <https://lightning-bolts.readthedocs.io/en/latest/callbacks/vision.html#confused-logit>`_.
- `Interpolate the latent space of any variational model <https://lightning-bolts.readthedocs.io/en/latest/callbacks/variational.html#latent-dim-interpolator>`_.
- `Log images to Tensorboard for any model <https://lightning-bolts.readthedocs.io/en/latest/callbacks/vision.html#tensorboard-image-generator>`_.


--------------

******************