
Commit c407441

Remove the BaguaStrategy (#16746)
* remove bagua
* remove
* remove docker file entry
1 parent 3902088 commit c407441

File tree

14 files changed: +0 -656 lines changed


.azure/gpu-tests-pytorch.yml

Lines changed: 0 additions & 2 deletions
@@ -112,8 +112,6 @@ jobs:
   - bash: |
       set -e
-      CUDA_VERSION_BAGUA=$(python -c "print([ver for ver in [116,113,111,102] if $CUDA_VERSION_MM >= ver][0])")
-      pip install "bagua-cuda$CUDA_VERSION_BAGUA"
       CUDA_VERSION_MM_COLOSSALAI=$(python -c "import torch ; print(''.join(map(str, torch.version.cuda)))")
       CUDA_VERSION_COLOSSALAI=$(python -c "print([ver for ver in [11.3, 11.1] if $CUDA_VERSION_MM_COLOSSALAI >= ver][0])")
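
The deleted CI step derived the Bagua wheel suffix from the CUDA version with a Python one-liner. As a minimal standalone sketch of that selection logic (the function name here is illustrative, not part of the pipeline), the expression returns the first entry of [116, 113, 111, 102] that the major+minor CUDA number is greater than or equal to:

    # Sketch only: mirrors the list comprehension in the deleted line above.
    def pick_bagua_cuda(cuda_version_mm: int) -> int:
        return [ver for ver in [116, 113, 111, 102] if cuda_version_mm >= ver][0]

    assert pick_bagua_cuda(117) == 116  # CUDA 11.7 -> "bagua-cuda116"
    assert pick_bagua_cuda(113) == 113  # CUDA 11.3 -> "bagua-cuda113"
    assert pick_bagua_cuda(102) == 102  # CUDA 10.2 -> "bagua-cuda102"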

dockers/base-cuda/Dockerfile

Lines changed: 0 additions & 13 deletions
@@ -98,19 +98,6 @@ RUN \
     pip install -r requirements/pytorch/base.txt --no-cache-dir --find-links https://download.pytorch.org/whl/cu${CUDA_VERSION_MM}/torch_stable.html && \
     rm assistant.py

-
-RUN \
-    # install Bagua
-    if [[ $PYTORCH_VERSION != "1.13" ]]; then \
-    CUDA_VERSION_MM=$(python -c "print(''.join('$CUDA_VERSION'.split('.')[:2]))") ; \
-    CUDA_VERSION_BAGUA=$(python -c "print([ver for ver in [116,113,111,102] if $CUDA_VERSION_MM >= ver][0])") ; \
-    pip install "bagua-cuda$CUDA_VERSION_BAGUA" ; \
-    if [[ "$CUDA_VERSION_MM" = "$CUDA_VERSION_BAGUA" ]]; then \
-    python -c "import bagua_core; bagua_core.install_deps()"; \
-    fi ; \
-    python -c "import bagua; print(bagua.__version__)"; \
-    fi
-
 RUN \
     # install ColossalAI
     # TODO: 1.13 wheels are not released, remove skip once they are
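
The deleted layer first reduces $CUDA_VERSION to its major+minor digits and only runs bagua_core.install_deps() when that value exactly matches the selected wheel suffix. A minimal sketch of the string transform, with an illustrative helper name that is not part of the image build:

    # Sketch only: "11.7.1" -> "117", as computed in the deleted RUN block.
    def cuda_mm(cuda_version: str) -> str:
        return "".join(cuda_version.split(".")[:2])

    print(cuda_mm("11.7.1"))  # "117": nearest wheel is bagua-cuda116, no exact match, so install_deps() is skipped
    print(cuda_mm("11.3.1"))  # "113": exact match with bagua-cuda113, so install_deps() would also run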

docs/source-pytorch/accelerators/gpu_intermediate.rst

Lines changed: 0 additions & 114 deletions
@@ -25,7 +25,6 @@ Lightning supports multiple ways of doing distributed training.
 - Regular (``strategy='ddp'``)
 - Spawn (``strategy='ddp_spawn'``)
 - Notebook/Fork (``strategy='ddp_notebook'``)
-- Bagua (``strategy='bagua'``) (multiple-gpus across many machines with advanced training algorithms)

 .. note::
     If you request multiple GPUs or nodes without setting a mode, DDP Spawn will be automatically used.
@@ -235,119 +234,6 @@ Comparison of DDP variants and tradeoffs
   - Fast


-Bagua
-^^^^^
-`Bagua <https://github.com/BaguaSys/bagua>`_ is a deep learning training acceleration framework which supports
-multiple advanced distributed training algorithms including:
-
-- `Gradient AllReduce <https://tutorials.baguasys.com/algorithms/gradient-allreduce>`_ for centralized synchronous communication, where gradients are averaged among all workers.
-- `Decentralized SGD <https://tutorials.baguasys.com/algorithms/decentralized>`_ for decentralized synchronous communication, where each worker exchanges data with one or a few specific workers.
-- `ByteGrad <https://tutorials.baguasys.com/algorithms/bytegrad>`_ and `QAdam <https://tutorials.baguasys.com/algorithms/q-adam>`_ for low precision communication, where data is compressed into low precision before communication.
-- `Asynchronous Model Average <https://tutorials.baguasys.com/algorithms/async-model-average>`_ for asynchronous communication, where workers are not required to be synchronized in the same iteration in a lock-step style.
-
-By default, Bagua uses *Gradient AllReduce* algorithm, which is also the algorithm implemented in DDP,
-but Bagua can usually produce a higher training throughput due to its backend written in Rust.
-
-.. code-block:: python
-
-    # train on 4 GPUs (using Bagua mode)
-    trainer = Trainer(strategy="bagua", accelerator="gpu", devices=4)
-
-
-By specifying the ``algorithm`` in the ``BaguaStrategy``, you can select more advanced training algorithms featured by Bagua:
-
-
-.. code-block:: python
-
-    # train on 4 GPUs, using Bagua Gradient AllReduce algorithm
-    trainer = Trainer(
-        strategy=BaguaStrategy(algorithm="gradient_allreduce"),
-        accelerator="gpu",
-        devices=4,
-    )
-
-    # train on 4 GPUs, using Bagua ByteGrad algorithm
-    trainer = Trainer(
-        strategy=BaguaStrategy(algorithm="bytegrad"),
-        accelerator="gpu",
-        devices=4,
-    )
-
-    # train on 4 GPUs, using Bagua Decentralized SGD
-    trainer = Trainer(
-        strategy=BaguaStrategy(algorithm="decentralized"),
-        accelerator="gpu",
-        devices=4,
-    )
-
-    # train on 4 GPUs, using Bagua Low Precision Decentralized SGD
-    trainer = Trainer(
-        strategy=BaguaStrategy(algorithm="low_precision_decentralized"),
-        accelerator="gpu",
-        devices=4,
-    )
-
-    # train on 4 GPUs, using Asynchronous Model Average algorithm, with a synchronization interval of 100ms
-    trainer = Trainer(
-        strategy=BaguaStrategy(algorithm="async", sync_interval_ms=100),
-        accelerator="gpu",
-        devices=4,
-    )
-
-To use *QAdam*, we need to initialize
-`QAdamOptimizer <https://bagua.readthedocs.io/en/latest/autoapi/bagua/torch_api/algorithms/q_adam/index.html#bagua.torch_api.algorithms.q_adam.QAdamOptimizer>`_ first:
-
-.. code-block:: python
-
-    from pytorch_lightning.strategies import BaguaStrategy
-    from bagua.torch_api.algorithms.q_adam import QAdamOptimizer
-
-
-    class MyModel(pl.LightningModule):
-        ...
-
-        def configure_optimizers(self):
-            # initialize QAdam Optimizer
-            return QAdamOptimizer(self.parameters(), lr=0.05, warmup_steps=100)
-
-
-    model = MyModel()
-    trainer = Trainer(
-        accelerator="gpu",
-        devices=4,
-        strategy=BaguaStrategy(algorithm="qadam"),
-    )
-    trainer.fit(model)
-
-Bagua relies on its own `launcher <https://tutorials.baguasys.com/getting-started/#launch-job>`_ to schedule jobs.
-Below, find examples using ``bagua.distributed.launch``, which follows the ``torch.distributed.launch`` API:
-
-.. code-block:: bash
-
-    # start training with 8 GPUs on a single node
-    python -m bagua.distributed.launch --nproc_per_node=8 train.py
-
-If the ssh service is available with passwordless login on each node, you can launch the distributed job on a
-single node with ``baguarun``, which has a syntax similar to ``mpirun``. When starting the job, ``baguarun`` will
-automatically spawn new processes on each of the training nodes provided by the ``--host_list`` option; each node in it
-is described as an IP address followed by an SSH port.
-
-.. code-block:: bash
-
-    # Run on node1 (or node2) to start training on two nodes (node1 and node2), 8 GPUs per node
-    baguarun --host_list hostname1:ssh_port1,hostname2:ssh_port2 --nproc_per_node=8 --master_port=port1 train.py
-
-
-.. note:: You can also start training in the same way as Distributed Data Parallel. However, system optimizations like
-    `Bagua-Net <https://tutorials.baguasys.com/more-optimizations/bagua-net>`_ and
-    `Performance autotuning <https://tutorials.baguasys.com/performance-autotuning/>`_ can only be enabled through the Bagua
-    launcher. It is worth noting that with ``Bagua-Net``, Distributed Data Parallel can also achieve
-    better performance without modifying the training script.
-
-
-See `Bagua Tutorials <https://tutorials.baguasys.com/>`_ for more details on installation and advanced features.
-
-
 DP caveats
 ^^^^^^^^^^
 In DP each GPU within a machine sees a portion of a batch.
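
With this section removed, the DDP variants listed at the top of the page remain the documented multi-GPU path. As a minimal, illustrative sketch (not part of the diff) of the closest equivalent launch, plain DDP also performs gradient all-reduce, as the removed paragraph notes:

    from pytorch_lightning import Trainer

    # train on 4 GPUs with the built-in DDP strategy
    trainer = Trainer(strategy="ddp", accelerator="gpu", devices=4)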

docs/source-pytorch/api_references.rst

Lines changed: 0 additions & 1 deletion
@@ -213,7 +213,6 @@ strategies
     :nosignatures:
     :template: classtemplate.rst

-    BaguaStrategy
     ColossalAIStrategy
     DDPSpawnStrategy
     DDPStrategy

docs/source-pytorch/extensions/strategy.rst

Lines changed: 0 additions & 3 deletions
@@ -69,9 +69,6 @@ The below table lists all relevant strategies available in Lightning with their
    * - Name
      - Class
      - Description
-   * - bagua
-     - :class:`~pytorch_lightning.strategies.BaguaStrategy`
-     - Strategy for training using the Bagua library, with advanced distributed training algorithms and system optimizations. :ref:`Learn more. <accelerators/gpu_intermediate:Bagua>`
    * - colossalai
      - :class:`~pytorch_lightning.strategies.ColossalAIStrategy`
      - Colossal-AI provides a collection of parallel components for you. It aims to support you to write your distributed deep learning models just like how you write your model on your laptop. `Learn more. <https://www.colossalai.org/>`__
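
The table pairs a registered name string with a strategy class; either form can be passed to the Trainer, with the class form used when constructor arguments are needed. A minimal sketch using a strategy that remains after this change (illustrative only, not part of the diff):

    from pytorch_lightning import Trainer
    from pytorch_lightning.strategies import DDPStrategy

    # select by registered name
    trainer = Trainer(strategy="ddp", accelerator="gpu", devices=2)

    # or by class, to pass constructor arguments
    trainer = Trainer(strategy=DDPStrategy(find_unused_parameters=False), accelerator="gpu", devices=2)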
Lines changed: 0 additions & 1 deletion
@@ -1,3 +1,2 @@
 if __name__ == "__main__":
-    import bagua  # noqa: F401
     import deepspeed  # noqa: F401

src/lightning/pytorch/plugins/environments/__init__.py

Lines changed: 0 additions & 1 deletion
@@ -21,4 +21,3 @@
     TorchElasticEnvironment,
     XLAEnvironment,
 )
-from lightning.pytorch.plugins.environments.bagua_environment import BaguaEnvironment  # noqa: F401

src/lightning/pytorch/plugins/environments/bagua_environment.py

Lines changed: 0 additions & 62 deletions
This file was deleted.

src/lightning/pytorch/strategies/__init__.py

Lines changed: 0 additions & 1 deletion
@@ -12,7 +12,6 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 from lightning.fabric.strategies.registry import _StrategyRegistry
-from lightning.pytorch.strategies.bagua import BaguaStrategy  # noqa: F401
 from lightning.pytorch.strategies.colossalai import ColossalAIStrategy  # noqa: F401
 from lightning.pytorch.strategies.ddp import DDPStrategy  # noqa: F401
 from lightning.pytorch.strategies.ddp_spawn import DDPSpawnStrategy  # noqa: F401

0 commit comments
