Merged
Changes from 1 commit
52 changes: 43 additions & 9 deletions docs/source-pytorch/cli/lightning_cli.rst
@@ -2,9 +2,25 @@

.. _lightning-cli:

############################
Eliminate config boilerplate
############################
######################################
Configure hyperparameters from the CLI
######################################

*************
Why use a CLI
*************

When running deep learning experiments, a couple of good practices are recommended:

- Separate configuration from source code
- Guarantee reproducibility of experiments

Implementing a command line interface (CLI) makes it possible to execute an experiment from a shell terminal. With a
CLI, there is a clear separation between the Python source code and the hyperparameters used for a particular
experiment. If the CLI corresponds to a stable version of the code, then an experiment can be reproduced by
installing the same version of the code plus dependencies and running with the same configuration (CLI arguments).

----

*********
Basic use
@@ -26,7 +42,7 @@ Basic use
:tag: intermediate

.. displayitem::
:header: 2: Mix models and datasets
:header: 2: Mix models, datasets and optimizers
:description: Support multiple models, datasets, optimizers and learning rate schedulers
:col_css: col-md-4
:button_link: lightning_cli_intermediate_2.html
@@ -60,34 +76,52 @@ Advanced use
.. displayitem::
:header: YAML for production
:description: Use the Lightning CLI with YAMLs for production environments
:col_css: col-md-6
:col_css: col-md-4
:button_link: lightning_cli_advanced_2.html
:height: 150
:tag: advanced

.. displayitem::
:header: Customize for complex projects
:description: Learn how to implement CLIs for complex projects.
:col_css: col-md-6
:description: Learn how to implement CLIs for complex projects
:col_css: col-md-4
:button_link: lightning_cli_advanced_3.html
:height: 150
:tag: expert
:tag: advanced

.. displayitem::
:header: Extend the Lightning CLI
:description: Customize the Lightning CLI
:col_css: col-md-6
:col_css: col-md-4
:button_link: lightning_cli_expert.html
:height: 150
:tag: expert

----

*************
Miscellaneous
*************

.. raw:: html

<div class="display-card-container">
<div class="row">

.. displayitem::
:header: FAQ
:description: Frequently asked questions about working with the Lightning CLI and YAML files
:col_css: col-md-6
:button_link: lightning_cli_faq.html
:height: 150

.. displayitem::
:header: Legacy CLIs
:description: Documentation for the legacy argparse-based CLIs
:col_css: col-md-6
:button_link: ../common/hyperparameters.html
:height: 150

.. raw:: html

</div>
174 changes: 115 additions & 59 deletions docs/source-pytorch/cli/lightning_cli_advanced.rst
@@ -1,113 +1,169 @@
:orphan:

#######################################
Eliminate config boilerplate (Advanced)
#######################################
#################################################
Configure hyperparameters from the CLI (Advanced)
#################################################
**Audience:** Users looking to modularize their code for a professional project.

**Pre-reqs:** You must have read :doc:`(Control it all from the CLI) <lightning_cli_intermediate>`.
**Pre-reqs:** You must have read :doc:`(Mix models and datasets) <lightning_cli_intermediate_2>`.

As a project becomes more complex, the number of configurable options becomes very large, making it inconvenient to
control through individual command line arguments. To address this, CLIs implemented using
:class:`~pytorch_lightning.cli.LightningCLI` always support receiving input from configuration files. The default format
used for config files is yaml.

.. tip::

If you are unfamiliar with yaml, have a look at the short introduction at `realpython.com#yaml-syntax
<https://realpython.com/python-yaml/#yaml-syntax>`__.


----

***************************
What is a yaml config file?
***************************
A yaml is a standard configuration file that describes parameters for sections of a program. It is a common tool in engineering, and it has recently started to gain popularity in machine learning.
***********************
Run using a config file
***********************
To run the CLI using a yaml config, do:

.. code:: yaml
.. code:: bash
# file.yaml
car:
max_speed:100
max_passengers:2
plane:
fuel_capacity: 50
class_3:
option_1: 'x'
option_2: 'y'
python main.py fit --config config.yaml
Individual arguments can be given to override options in the config file:

.. code:: bash
python main.py fit --config config.yaml --trainer.max_epochs 100
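As a rough illustration of this precedence, the sketch below applies a dotted command line key such as ``trainer.max_epochs`` on top of a dictionary that stands in for the parsed ``config.yaml``. The helper ``apply_override`` and the sample values are hypothetical and are not part of ``LightningCLI``; the code only models the behaviour described above.

```python
from copy import deepcopy


def apply_override(config, dotted_key, value):
    """Return a copy of ``config`` with the entry at ``dotted_key`` replaced."""
    result = deepcopy(config)  # leave the values loaded from the file untouched
    node = result
    *parents, leaf = dotted_key.split(".")
    for name in parents:
        # Walk down the nested sections, creating them if they are missing.
        node = node.setdefault(name, {})
    node[leaf] = value
    return result


# Stand-in for the content of config.yaml (values are illustrative only).
file_config = {"trainer": {"max_epochs": 10}, "model": {"learning_rate": 0.02}}

# Equivalent in spirit to: --config config.yaml --trainer.max_epochs 100
effective = apply_override(file_config, "trainer.max_epochs", 100)
print(effective["trainer"]["max_epochs"])  # 100
```

Only the named option changes; every other value still comes from the config file.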
----

************************
Automatic save of config
************************

*********************
Print the config used
*********************
Before or after you run a training routine, you can print the full training spec in yaml format using ``--print_config``:
To ease experiment reporting and reproducibility, ``LightningCLI`` by default automatically saves the full yaml
configuration in the log directory. After multiple fit runs with different hyperparameters, each run will have a
``config.yaml`` file in its respective log directory. These files can be used to trivially reproduce an experiment, e.g.:

.. code:: bash
python main.py fit --config lightning_logs/version_7/config.yaml
The automatic saving of the config is done by the special callback :class:`~pytorch_lightning.cli.SaveConfigCallback`.
This callback is automatically added to the ``Trainer``. To disable saving the config, instantiate ``LightningCLI``
with ``save_config_callback=None``.
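When many runs have accumulated, picking the most recent saved config by hand gets tedious. The sketch below is one possible way to locate it, assuming the ``lightning_logs/version_N/config.yaml`` layout shown in the example above. The helper ``latest_saved_config`` is hypothetical and not part of Lightning.

```python
import tempfile
from pathlib import Path


def latest_saved_config(log_dir="lightning_logs"):
    """Return config.yaml from the highest-numbered version_N directory, or None."""
    root = Path(log_dir)
    if not root.is_dir():
        return None
    candidates = []
    for path in root.glob("version_*/config.yaml"):
        suffix = path.parent.name.split("_", 1)[1]
        if suffix.isdigit():  # skip directories such as version_tmp
            candidates.append((int(suffix), path))
    if not candidates:
        return None
    # Compare numerically so that version_10 ranks above version_7.
    return max(candidates, key=lambda item: item[0])[1]


# Demo on a throwaway directory that mimics the assumed layout.
with tempfile.TemporaryDirectory() as tmp:
    for version in (2, 7, 10):
        run_dir = Path(tmp) / f"version_{version}"
        run_dir.mkdir()
        (run_dir / "config.yaml").write_text("seed_everything: null\n")
    found = latest_saved_config(tmp)
    print(found.parent.name)  # version_10
```

The returned path can then be passed to ``--config`` to rerun that experiment.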

----

*********************************
Prepare a config file for the CLI
*********************************
The ``--help`` option of the CLIs can be used to learn which configuration options are available and how to use them.
However, writing a config from scratch can be time-consuming and error-prone. To alleviate this, the CLIs have the
``--print_config`` argument, which prints the configuration to stdout without running the command.

For a CLI implemented as ``LightningCLI(DemoModel, BoringDataModule)``, executing:

.. code:: bash
python main.py fit --print_config
which generates the following config:
generates a config with all default values like the following:

.. code:: bash
seed_everything: null
trainer:
logger: true
...
terminate_on_nan: null
logger: true
...
model:
out_dim: 10
learning_rate: 0.02
out_dim: 10
learning_rate: 0.02
data:
data_dir: ./
data_dir: ./
ckpt_path: null
----

********************************
Write a config yaml from the CLI
********************************
To have a copy of the configuration that produced this model, save a *yaml* file from the *--print_config* outputs:
Other command line arguments can be given and will be reflected in the printed configuration. A use case for this is CLIs
that accept multiple models. By default no model is selected, which means that the printed config will not include model
settings. To get a config with the default values of a particular model, run:

.. code:: bash
python main.py fit --model.learning_rate 0.001 --print_config > config.yaml
python main.py fit --model DemoModel --print_config
----

**********************
Run from a single yaml
**********************
To run from a yaml, pass a yaml produced with ``--print_config`` to the ``--config`` argument:
which generates a config like:

.. code:: bash
python main.py fit --config config.yaml
seed_everything: null
trainer:
...
model:
class_path: pytorch_lightning.demos.boring_classes.DemoModel
init_args:
out_dim: 10
learning_rate: 0.02
ckpt_path: null
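The ``class_path`` and ``init_args`` pair in this config names a class and the keyword arguments used to construct it. The sketch below shows one way such a pair can be resolved, using only ``importlib``. It is not how ``LightningCLI`` is implemented, the helper ``instantiate`` is hypothetical, and ``datetime.timedelta`` is a standard-library stand-in for a model class so that the example runs without Lightning installed.

```python
import importlib


def instantiate(spec):
    """Build an object from a mapping with ``class_path`` and optional ``init_args``."""
    module_name, _, class_name = spec["class_path"].rpartition(".")
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls(**spec.get("init_args", {}))


# Stand-in for a "model:" section; a real config would name a LightningModule.
spec = {"class_path": "datetime.timedelta", "init_args": {"hours": 2}}
obj = instantiate(spec)
print(type(obj).__name__, obj.total_seconds())  # timedelta 7200.0
```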
when using a yaml to run, you can still pass in inline arguments
.. tip::

.. code:: bash
A standard procedure to run experiments can be:

python main.py fit --config config.yaml --trainer.max_epochs 100
.. code:: bash
# Print a configuration to have as reference
python main.py fit --print_config > config.yaml
# Modify the config to your liking - you can remove all default arguments
nano config.yaml
# Fit your model using the edited configuration
python main.py fit --config config.yaml
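Removing default arguments by hand, as mentioned in the tip, amounts to keeping only the entries that differ from the printed defaults. The sketch below models that step on plain dictionaries standing in for the parsed yaml files. The helper ``prune_defaults`` is hypothetical and the values mirror the example config shown earlier.

```python
_MISSING = object()  # marks keys that have no default at all


def prune_defaults(config, defaults):
    """Return only the entries of ``config`` that differ from ``defaults``."""
    pruned = {}
    for key, value in config.items():
        default = defaults.get(key, _MISSING)
        if isinstance(value, dict) and isinstance(default, dict):
            nested = prune_defaults(value, default)
            if nested:  # drop sections in which nothing changed
                pruned[key] = nested
        elif default is _MISSING or value != default:
            pruned[key] = value
    return pruned


defaults = {
    "seed_everything": None,
    "trainer": {"logger": True},
    "model": {"out_dim": 10, "learning_rate": 0.02},
    "data": {"data_dir": "./"},
}
# Same config after editing a single hyperparameter.
edited = {
    "seed_everything": None,
    "trainer": {"logger": True},
    "model": {"out_dim": 10, "learning_rate": 0.001},
    "data": {"data_dir": "./"},
}

minimal = prune_defaults(edited, defaults)
print(minimal)  # {'model': {'learning_rate': 0.001}}
```

A config reduced this way documents only what was changed for the experiment, while the defaults continue to come from the code.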
----

******************
Compose yaml files
******************
For production or complex research projects it's advisable to have each object in its own config file. To compose all the configs, pass them all inline:
********************
Compose config files
********************
Multiple config files can be provided and they will be parsed sequentially. Let's say we have two configs with common
settings:

.. code:: yaml
# config_1.yaml
trainer:
num_epochs: 10
...
# config_2.yaml
trainer:
num_epochs: 20
...
The value from the last config will be used, ``num_epochs = 20`` in this case:

.. code-block:: bash
$ python trainer.py fit --config trainer.yaml --config datamodules.yaml --config models.yaml ...
$ python main.py fit --config config_1.yaml --config config_2.yaml
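The "last config wins" rule can be pictured as a sequential merge of dictionaries. The sketch below is a simplified model of that behaviour, not the parser's actual code. It assumes that nested sections are merged key by key, the helper ``merge_configs`` is hypothetical, the key names are copied from the example configs as written, and the ``logger`` entry is added only to show that untouched keys survive.

```python
from copy import deepcopy


def _merge_into(target, source):
    """Merge ``source`` into ``target`` in place; values from ``source`` win."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            _merge_into(target[key], value)  # descend into shared sections
        else:
            target[key] = deepcopy(value)


def merge_configs(*configs):
    """Combine configs in the order given; later ones override earlier ones."""
    merged = {}
    for config in configs:
        _merge_into(merged, config)
    return merged


# Stand-ins for the parsed config_1.yaml and config_2.yaml.
config_1 = {"trainer": {"num_epochs": 10, "logger": True}}
config_2 = {"trainer": {"num_epochs": 20}}

merged = merge_configs(config_1, config_2)
print(merged)  # {'trainer': {'num_epochs': 20, 'logger': True}}
```

Swapping the order of the two arguments would leave ``num_epochs`` at 10, which is why the order of the ``--config`` arguments matters.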
The configs will be parsed sequentially. Let's say we have two configs with the same args:
----

*********************
Use groups of options
*********************
Groups of options can also be given as independent config files. For configs like:

.. code:: yaml
# trainer.yaml
trainer:
num_epochs: 10
num_epochs: 10
# model.yaml
out_dim: 7
# trainer_2.yaml
trainer:
num_epochs: 20
# data.yaml
data_dir: ./data
the ones from the last config will be used (num_epochs = 20) in this case:
a fit command can be run as:

.. code-block:: bash
$ python trainer.py fit --config trainer.yaml --config trainer_2.yaml
$ python main.py fit --trainer trainer.yaml --model model.yaml --data data.yaml [...]