
Commit 14f441c

Deprecate nvidia/apex (#16039)
1 parent 7cbdc68 commit 14f441c


58 files changed: +514 additions, -450 deletions

docs/source-pytorch/accelerators/gpu_intermediate.rst

Lines changed: 8 additions & 7 deletions
@@ -469,25 +469,26 @@ Validation and test step have the same option when using DP.
 Distributed and 16-bit precision
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-Due to an issue with Apex and DataParallel (PyTorch and NVIDIA issue), Lightning does
-not allow 16-bit and DP training. We tried to get this to work, but it's an issue on their end.
-
 Below are the possible configurations we support.

 +-------+---------+-----+-----+--------+-----------------------------------------------------------------------+
-| 1 GPU | 1+ GPUs | DP | DDP | 16-bit | command |
+| 1 GPU | 1+ GPUs | DDP | DP | 16-bit | command |
 +=======+=========+=====+=====+========+=======================================================================+
 | Y | | | | | `Trainer(accelerator="gpu", devices=1)` |
 +-------+---------+-----+-----+--------+-----------------------------------------------------------------------+
 | Y | | | | Y | `Trainer(accelerator="gpu", devices=1, precision=16)` |
 +-------+---------+-----+-----+--------+-----------------------------------------------------------------------+
-| | Y | Y | | | `Trainer(accelerator="gpu", devices=k, strategy='dp')` |
+| | Y | Y | | | `Trainer(accelerator="gpu", devices=k, strategy='ddp')` |
++-------+---------+-----+-----+--------+-----------------------------------------------------------------------+
+| | Y | Y | | Y | `Trainer(accelerator="gpu", devices=k, strategy='ddp', precision=16)` |
 +-------+---------+-----+-----+--------+-----------------------------------------------------------------------+
-| | Y | | Y | | `Trainer(accelerator="gpu", devices=k, strategy='ddp')` |
+| | Y | | Y | | `Trainer(accelerator="gpu", devices=k, strategy='dp')` |
 +-------+---------+-----+-----+--------+-----------------------------------------------------------------------+
-| | Y | | Y | Y | `Trainer(accelerator="gpu", devices=k, strategy='ddp', precision=16)` |
+| | Y | | Y | Y | `Trainer(accelerator="gpu", devices=k, strategy='dp', precision=16)` |
 +-------+---------+-----+-----+--------+-----------------------------------------------------------------------+

+DDP and DP can also be used with 1 GPU, but there's no reason to do so other than debugging distributed-related issues.
+

 Implement Your Own Distributed (DDP) training
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
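Note: as a minimal runnable sketch of the recommended multi-GPU row in the table above (DDP with 16-bit precision); the TinyModel module and the device count are illustrative placeholders, not part of this diff:

    import torch
    from torch.utils.data import DataLoader, TensorDataset
    import pytorch_lightning as pl

    class TinyModel(pl.LightningModule):
        def __init__(self):
            super().__init__()
            self.layer = torch.nn.Linear(32, 2)

        def training_step(self, batch, batch_idx):
            x, y = batch
            return torch.nn.functional.cross_entropy(self.layer(x), y)

        def configure_optimizers(self):
            return torch.optim.SGD(self.parameters(), lr=0.1)

    if __name__ == "__main__":
        dataset = TensorDataset(torch.randn(64, 32), torch.randint(0, 2, (64,)))
        # DDP across k GPUs with FP16 mixed precision, matching the table above (here k=2)
        trainer = pl.Trainer(accelerator="gpu", devices=2, strategy="ddp", precision=16, max_epochs=1)
        trainer.fit(TinyModel(), DataLoader(dataset, batch_size=8))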

docs/source-pytorch/api_references.rst

Lines changed: 1 addition & 2 deletions
@@ -184,15 +184,14 @@ precision
     :nosignatures:
     :template: classtemplate.rst

-    ApexMixedPrecisionPlugin
     ColossalAIPrecisionPlugin
     DeepSpeedPrecisionPlugin
     DoublePrecisionPlugin
     FullyShardedNativeMixedPrecisionPlugin
     FullyShardedNativeNativeMixedPrecisionPlugin
     HPUPrecisionPlugin
     IPUPrecisionPlugin
-    NativeMixedPrecisionPlugin
+    MixedPrecisionPlugin
     PrecisionPlugin
     ShardedNativeMixedPrecisionPlugin
     TPUBf16PrecisionPlugin

docs/source-pytorch/common/checkpointing_basic.rst

Lines changed: 1 addition & 1 deletion
@@ -186,5 +186,5 @@ If you don't just want to load weights, but instead restore the full training, d
     model = LitModel()
     trainer = Trainer()

-    # automatically restores model, epoch, step, LR schedulers, apex, etc...
+    # automatically restores model, epoch, step, LR schedulers, etc...
     trainer.fit(model, ckpt_path="some/path/to/my_checkpoint.ckpt")

docs/source-pytorch/common/optimization.rst

Lines changed: 0 additions & 4 deletions
@@ -151,7 +151,6 @@ For example, here step optimizer A every batch and optimizer B every 2 batches.
     optimizer_idx,
     optimizer_closure,
     on_tpu=False,
-    using_native_amp=False,
     using_lbfgs=False,
 ):
     # update generator every step
@@ -183,7 +182,6 @@ Here we add a manual learning rate warm-up without an lr scheduler.
     optimizer_idx,
     optimizer_closure,
     on_tpu=False,
-    using_native_amp=False,
     using_lbfgs=False,
 ):
     # update params
@@ -215,7 +213,6 @@ to perform a step, Lightning won't be able to support accelerators, precision an
     optimizer_idx,
     optimizer_closure,
     on_tpu=False,
-    using_native_amp=False,
     using_lbfgs=False,
 ):
     optimizer.step(closure=optimizer_closure)
@@ -232,7 +229,6 @@ to perform a step, Lightning won't be able to support accelerators, precision an
     optimizer_idx,
     optimizer_closure,
     on_tpu=False,
-    using_native_amp=False,
     using_lbfgs=False,
 ):
     optimizer = optimizer.optimizer
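Note: these hunks only drop using_native_amp from the documented hook, so here is a hedged sketch of a full override after the change. The leading epoch/batch_idx/optimizer arguments come from the surrounding docs rather than this hunk, and the model class and warm-up constants are placeholders:

    import torch
    import pytorch_lightning as pl

    class LitWarmupModel(pl.LightningModule):
        def __init__(self):
            super().__init__()
            self.layer = torch.nn.Linear(32, 2)

        def training_step(self, batch, batch_idx):
            x, y = batch
            return torch.nn.functional.cross_entropy(self.layer(x), y)

        def configure_optimizers(self):
            return torch.optim.SGD(self.parameters(), lr=0.1)

        def optimizer_step(
            self,
            epoch,
            batch_idx,
            optimizer,
            optimizer_idx,
            optimizer_closure,
            on_tpu=False,
            using_lbfgs=False,  # using_native_amp is gone from the documented signature
        ):
            # manual learning-rate warm-up over the first 500 steps, then the regular update;
            # the closure runs training_step and backward before the optimizer step
            if self.trainer.global_step < 500:
                scale = min(1.0, float(self.trainer.global_step + 1) / 500.0)
                for pg in optimizer.param_groups:
                    pg["lr"] = scale * 0.1
            optimizer.step(closure=optimizer_closure)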

docs/source-pytorch/common/precision_intermediate.rst

Lines changed: 1 addition & 40 deletions
@@ -58,6 +58,7 @@ FP16 Mixed Precision
 ********************

 In most cases, mixed precision uses FP16. Supported `PyTorch operations <https://pytorch.org/docs/stable/amp.html#op-specific-behavior>`__ automatically run in FP16, saving memory and improving throughput on the supported accelerators.
+Since computation happens in FP16, there is a chance of numerical instability during training. This is handled internally by a dynamic grad scaler which skips invalid steps and adjusts the scaler to ensure subsequent steps fall within a finite range. For more information `see the autocast docs <https://pytorch.org/docs/stable/amp.html#gradient-scaling>`__.


 .. note::
@@ -69,46 +70,6 @@ In most cases, mixed precision uses FP16. Supported `PyTorch operations <https:/

     Trainer(accelerator="gpu", devices=1, precision=16)

-
-PyTorch Native
---------------
-
-PyTorch 1.6 release introduced mixed precision functionality into their core as the AMP package, `torch.cuda.amp <https://pytorch.org/docs/stable/amp.html>`__. It is more flexible and intuitive compared to `NVIDIA APEX <https://github.com/NVIDIA/apex>`__.
-Since computation happens in FP16, there is a chance of numerical instability during training. This is handled internally by a dynamic grad scaler which skips invalid steps and adjusts the scaler to ensure subsequent steps fall within a finite range. For more information `see the autocast docs <https://pytorch.org/docs/stable/amp.html#gradient-scaling>`__.
-Lightning uses native amp by default with ``precision=16|"bf16"``. You can also set it using:
-
-.. testcode::
-
-    Trainer(precision=16, amp_backend="native")
-
-
-NVIDIA APEX
------------
-
-.. warning::
-
-    We strongly recommend using the above native mixed precision rather than NVIDIA APEX unless you require more refined control.
-
-`NVIDIA APEX <https://github.com/NVIDIA/apex>`__ offers additional flexibility in setting mixed precision. This can be useful when trying out different precision configurations, such as keeping most of your weights in FP16 and running computation in FP16.
-
-.. testcode::
-    :skipif: not _APEX_AVAILABLE or not torch.cuda.is_available()
-
-    Trainer(accelerator="gpu", devices=1, amp_backend="apex", precision=16)
-
-Set the `NVIDIA optimization level <https://nvidia.github.io/apex/amp.html#opt-levels>`__ via the precision plugin.
-
-.. testcode::
-    :skipif: not _APEX_AVAILABLE or not torch.cuda.is_available()
-
-    from pytorch_lightning.plugins import ApexMixedPrecisionPlugin
-
-
-    apex_plugin = ApexMixedPrecisionPlugin(amp_level="O3")
-    Trainer(accelerator="gpu", devices=1, precision=16, plugins=[apex_plugin])
-
-----
-
 ************************
 BFloat16 Mixed Precision
 ************************
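Note: after this change, mixed precision is selected with the precision flag alone; a short sketch, with the old amp_backend switch from the removed section no longer involved:

    from pytorch_lightning import Trainer

    # FP16 mixed precision backed by torch.cuda.amp; the dynamic grad scaler described
    # above is created automatically, no extra configuration needed
    trainer = Trainer(accelerator="gpu", devices=1, precision=16)

    # BF16 mixed precision, for hardware with native bfloat16 support
    trainer = Trainer(accelerator="gpu", devices=1, precision="bf16")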

docs/source-pytorch/common/trainer.rst

Lines changed: 0 additions & 42 deletions
@@ -289,27 +289,6 @@ Example::
     # no accumulation for epochs 1-4. accumulate 3 for epochs 5-10. accumulate 20 after that
     trainer = Trainer(accumulate_grad_batches={5: 3, 10: 20})

-amp_backend
-^^^^^^^^^^^
-
-.. raw:: html
-
-    <video width="50%" max-width="400px" controls
-    poster="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/pl_docs/trainer_flags/thumb/amp_backend.jpg"
-    src="https://pl-bolts-doc-images.s3.us-east-2.amazonaws.com/pl_docs/trainer_flags/amp_backend.mp4"></video>
-
-|
-
-Use PyTorch AMP ('native'), or NVIDIA apex ('apex').
-
-.. testcode::
-
-    # using PyTorch built-in AMP, default used by the Trainer
-    trainer = Trainer(amp_backend="native")
-
-    # using NVIDIA Apex
-    trainer = Trainer(amp_backend="apex")
-
 auto_scale_batch_size
 ^^^^^^^^^^^^^^^^^^^^^

@@ -1156,27 +1135,6 @@ Half precision, or mixed precision, is the combined use of 32 and 16 bit floatin

 .. note:: When running on TPUs, torch.bfloat16 will be used but tensor printing will still show torch.float32.

-.. admonition:: If you are interested in using Apex 16-bit training:
-    :class: dropdown
-
-    NVIDIA Apex and DDP have instability problems. We recommend using the native AMP for 16-bit precision with multiple GPUs.
-    To use Apex 16-bit training:
-
-    1. `Install apex. <https://github.com/NVIDIA/apex#quick-start>`__
-
-    2. Set the ``precision`` trainer flag to 16. You can customize the `Apex optimization level <https://nvidia.github.io/apex/amp.html#opt-levels>`_ by setting the ``amp_level`` flag
-       in the precision plugin.
-
-    .. testcode::
-        :skipif: not _APEX_AVAILABLE or not torch.cuda.is_available()
-
-        from pytorch_lightning.plugins import ApexMixedPrecisionPlugin
-
-
-        apex_plugin = ApexMixedPrecisionPlugin(amp_level="O2")
-        # turn on 16-bit
-        trainer = Trainer(accelerator="gpu", devices=1, precision=16, plugins=[apex_plugin])
-
 profiler
 ^^^^^^^^

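Note: for users migrating off the amp_backend flag, a hedged before/after sketch; the commented calls reproduce the documentation deleted above and are deprecated by this commit:

    from pytorch_lightning import Trainer

    # previously documented:
    #   trainer = Trainer(amp_backend="native")
    #   trainer = Trainer(amp_backend="apex")  # together with ApexMixedPrecisionPlugin(amp_level="O2")

    # now: the precision flag alone enables 16-bit training through torch.cuda.amp
    trainer = Trainer(accelerator="gpu", devices=1, precision=16)
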
docs/source-pytorch/conf.py

Lines changed: 0 additions & 1 deletion
@@ -398,7 +398,6 @@ def package_list_from_file(file):
     from pytorch_lightning.callbacks import Callback
     from pytorch_lightning.cli import _JSONARGPARSE_SIGNATURES_AVAILABLE as _JSONARGPARSE_AVAILABLE
     from pytorch_lightning.utilities import (
-        _APEX_AVAILABLE,
         _TORCHVISION_AVAILABLE,
     )
     from pytorch_lightning.loggers.neptune import _NEPTUNE_AVAILABLE

docs/source-pytorch/extensions/plugins.rst

Lines changed: 1 addition & 2 deletions
@@ -52,15 +52,14 @@ The full list of built-in precision plugins is listed below.
     :nosignatures:
     :template: classtemplate.rst

-    ApexMixedPrecisionPlugin
     ColossalAIPrecisionPlugin
     DeepSpeedPrecisionPlugin
     DoublePrecisionPlugin
     FullyShardedNativeMixedPrecisionPlugin
     FullyShardedNativeNativeMixedPrecisionPlugin
     HPUPrecisionPlugin
     IPUPrecisionPlugin
-    NativeMixedPrecisionPlugin
+    MixedPrecisionPlugin
     PrecisionPlugin
     ShardedNativeMixedPrecisionPlugin
     TPUBf16PrecisionPlugin
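Note: to illustrate the renamed plugin in this list, a sketch of passing it to the Trainer with a custom grad scaler; this assumes MixedPrecisionPlugin keeps the (precision, device, scaler) constructor of the old NativeMixedPrecisionPlugin:

    import torch
    from pytorch_lightning import Trainer
    from pytorch_lightning.plugins import MixedPrecisionPlugin

    # a GradScaler with a non-default initial scale, for finer control than precision=16 alone
    scaler = torch.cuda.amp.GradScaler(init_scale=2.0**14)
    precision_plugin = MixedPrecisionPlugin(precision=16, device="cuda", scaler=scaler)
    trainer = Trainer(accelerator="gpu", devices=1, plugins=[precision_plugin])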

docs/source-pytorch/model/manual_optimization.rst

Lines changed: 1 addition & 1 deletion
@@ -319,4 +319,4 @@ Here is an example using a closure function.
     opt.step(closure=closure)

 .. warning::
-    The :class:`~torch.optim.LBFGS` optimizer is not supported for apex AMP, native AMP, IPUs, or DeepSpeed.
+    The :class:`~torch.optim.LBFGS` optimizer is not supported for AMP, IPUs, or DeepSpeed.

src/lightning_lite/connector.py

Lines changed: 3 additions & 3 deletions
@@ -26,7 +26,7 @@
 from lightning_lite.plugins import (
     CheckpointIO,
     DeepSpeedPrecision,
-    NativeMixedPrecision,
+    MixedPrecision,
     Precision,
     TPUBf16Precision,
     TPUPrecision,
@@ -452,7 +452,7 @@ def _check_and_init_precision(self) -> Precision:
                 )
                 return TPUBf16Precision()
         if isinstance(self.strategy, DeepSpeedStrategy):
-            return DeepSpeedPrecision(self._precision_input, amp_type="native", amp_level=None)  # type: ignore
+            return DeepSpeedPrecision(self._precision_input)  # type: ignore

         if self._precision_input == 32:
             return Precision()
@@ -476,7 +476,7 @@ def _check_and_init_precision(self) -> Precision:

             if isinstance(self.strategy, FSDPStrategy):
                 return FSDPPrecision(precision=self._precision_input, device=device)
-            return NativeMixedPrecision(precision=self._precision_input, device=device)
+            return MixedPrecision(precision=self._precision_input, device=device)

         raise RuntimeError("No precision set")

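Note: as a rough standalone illustration of the dispatch these hunks touch, a hypothetical helper mirroring the changed branches; pick_precision is not a real Lightning Lite function, the real logic lives in _check_and_init_precision:

    from lightning_lite.plugins import DeepSpeedPrecision, MixedPrecision, Precision

    def pick_precision(precision, strategy_name, accelerator):
        """Hypothetical helper mirroring the branches changed in this commit."""
        if strategy_name == "deepspeed":
            # DeepSpeedPrecision no longer receives amp_type/amp_level
            return DeepSpeedPrecision(precision)
        if precision == 32:
            return Precision()
        if precision in (16, "bf16"):
            device = "cpu" if accelerator == "cpu" else "cuda"
            # renamed from NativeMixedPrecision in this commit
            return MixedPrecision(precision=precision, device=device)
        raise RuntimeError("No precision set")

    plugin = pick_precision(16, "ddp", "gpu")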