Conversation

EricHallahan
Contributor

The requirements files specified in ./requirements have historically been strictly pinned to prevent the CI Docker images from changing without our prior knowledge. However, this places a burden on users who would like to run GPT-NeoX on a host without containerization, and many non-critical packages are unnecessarily strictly specified as well. This motivates the following changes in this PR:

  • Loosen the dependency version requirements for non-critical packages
  • Remove the unnecessary einops and mpi4py packages
  • Make the installation of wandb optional, with its dependencies specified in a new requirements file ./requirements/requirements-wandb.txt
  • Fix an oversight that made the Megatron GPT-2 tokenizer inaccessible
  • Clean up the deepy.py DeepSpeed launcher script

StellaAthena previously approved these changes Nov 7, 2021

@StellaAthena
Member

This looks good to me. @EricHallahan’s explanation of how he decided which imports go inside main() seems like a reasonable rule of thumb, and I like the changes to the requirements system in general.

I haven’t run this myself, as I’m under the impression that you’ve been testing it extensively.

@sdtblck
Contributor

sdtblck commented Nov 7, 2021

Aside from the above ^ lgtm 🚀

StellaAthena previously approved these changes Nov 7, 2021

@StellaAthena
Member

@EricHallahan is there something you’re waiting on to merge this?

@EricHallahan
Contributor Author

EricHallahan commented Nov 8, 2021

I am working on verifying that the behavior with regard to missing Weights & Biases dependencies is what I intended/makes sense, and I also need to add corresponding documentation. I expect to have this ready to merge sometime early tomorrow.

Eliminate the usage of `shortuuid`
- Fix CITATION.cff
- Update README.md to reflect changes to wandb installation
- Remove `shortuuid` from requirements-wandb.txt

Enable wandb by default for the EleutherAI cluster.
@EricHallahan removed the request for review from joshlk November 9, 2021 17:28
@@ -22,14 +22,9 @@ def _get_cuda_bare_metal_version(cuda_dir):
srcpath = Path(__file__).parent.absolute()
cc_flag = []
_, bare_metal_major, _ = _get_cuda_bare_metal_version(cpp_extension.CUDA_HOME)
if int(bare_metal_major) >= 11:
Contributor

can you explain what this change is doing?

Contributor Author

This code forces building sm_70 (Volta) and sm_80 (Ampere) and no other architectures, which means the built extensions will fail to execute on older architectures like Kepler, Maxwell, or Pascal. By default, CUDAExtension builds for the locally detected hardware, so it is safe to remove this code. Should it be needed, compiling for non-local hardware is as easy as setting TORCH_CUDA_ARCH_LIST in the environment prior to running the script.
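
For illustration, here is a minimal sketch of a build script that relies on that default behavior (the file and extension names are hypothetical, not the actual fused_kernels.py contents):

    # Hypothetical minimal build script. With no explicit -gencode flags,
    # CUDAExtension compiles for the architectures of the locally visible GPUs.
    from setuptools import setup
    from torch.utils import cpp_extension

    setup(
        name="example_fused_kernels",
        ext_modules=[
            cpp_extension.CUDAExtension(
                name="example_fused_kernels",
                sources=["example_kernels.cpp", "example_kernels_cuda.cu"],
            )
        ],
        cmdclass={"build_ext": cpp_extension.BuildExtension},
    )

To cross-compile for hardware that is not present locally, one would set e.g. TORCH_CUDA_ARCH_LIST="7.0 8.0" in the environment before running the script.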

Contributor

Ok, sounds good. Did you test that the kernels built with this change

  1. still work the same on the A100s, and
  2. work on older architectures?

Re: the TORCH_CUDA_ARCH_LIST thing, should we add a comment saying so?

Contributor Author

I have tested it many times and have not seen any issues, and yes, we should absolutely add documentation about what to do if you need to force the arch.

try:
    import wandb
except ModuleNotFoundError:
    pass
Contributor

Won't wandb not being present cause some errors later in training?

Contributor Author

If wandb is not installed, the failed import sets use_wandb to False. All subsequent Weights & Biases code relies on use_wandb being True, and therefore nothing imported from wandb is ever executed.
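
A rough sketch of the pattern being described (the variable and function names here are illustrative, not the exact GPT-NeoX code):

    # Illustrative guard pattern: degrade gracefully when wandb is absent.
    try:
        import wandb

        WANDB_AVAILABLE = True
    except ModuleNotFoundError:
        WANDB_AVAILABLE = False


    def init_wandb(neox_args):
        # If the user requested wandb but it is not installed, disable it.
        if neox_args.use_wandb and not WANDB_AVAILABLE:
            print("wandb is not installed; disabling wandb logging.")
            neox_args.use_wandb = False
        if neox_args.use_wandb:
            wandb.init(project=neox_args.wandb_project)


    def log_metrics(neox_args, metrics, step):
        # Every wandb call is gated on use_wandb, so the module is never
        # touched when the import failed.
        if neox_args.use_wandb:
            wandb.log(metrics, step=step)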

Contributor

@sdtblck left a comment

I'm not sure of the motivation for the change to fused_kernels.py.
Also, I really don't like the requirements being this granular. I don't see the need for separate requirements files for wandb / tensorboard.

@StellaAthena
Member

I'm not sure of the motivation for the change to fused_kernels.py. Also, I really don't like the requirements being this granular. I don't see the need for separate requirements files for wandb / tensorboard.

If someone doesn't have WandB available and doesn't wish to use it, how would you prefer they proceed?

@sdtblck
Contributor

sdtblck commented Nov 9, 2021

I'm not sure of the motivation for the change to fused_kernels.py. Also, I really don't like the requirements being this granular. I don't see the need for separate requirements files for wandb / tensorboard.

If someone doesn't have WandB available and doesn't wish to use it, how would you prefer they proceed?

The reason sparse attention and onebitadam were separated out is that they're optional dependencies which are also a bit of a pain to install (cupy-cuda requires you to specify the CUDA version, and triton used to break often), so removing them from requirements.txt reduced complexity for most users.

As far as I'm aware, there are no such problems with the installation of wandb. You can just pip install it. Including it in requirements.txt does nothing more than take up a few kB more space on the user's device.

Including it in a separate file may mean someone has to take a few minutes to figure out why their logging isn't working, only to go back and realise that wandb isn't actually in requirements.txt and needs to be installed separately. I can't see a counter-scenario where it would actually save time or decrease complexity.

I don't know why / how requirements/requirements-tensorboard.txt ever became a separate file. @sweinbach any ideas?

@EricHallahan
Contributor Author

The only reason I separated it out is that it mirrors how TensorBoard was handled. If we don't think that makes sense I'm happy to change it.

A counterargument to instructing users to "just install wandb/tensorboard" is versioning, which is exactly the problem that requirements files are designed to solve. If you tell the user to install from the requirements file, you can at least narrow down the environment to what is in the file rather than having to guess which version the package manager chose.
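
For example, a hypothetical ./requirements/requirements-wandb.txt could consist of a single loosely pinned line (the version bound shown is illustrative):

    wandb>=0.10.0

A user who wants logging would then run pip install -r requirements/requirements-wandb.txt and get a known-compatible version range rather than whatever the resolver happens to pick.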

@sweinbach
Contributor

Maybe too much for this PR, but related: any reason not to use pip-compile to create the requirements files and actually pin dependencies?

@EricHallahan
Contributor Author

Maybe too much for this PR, but related: any reason not to use pip-compile to create the requirements files and actually pin dependencies?

I had originally planned this PR with a larger scope which included more granular dependency management (such as only installing the dependencies for evaluation if the user specifies they are interested in evaluation) and a setuptools script to manage that system (which would have also registered deepy.py as a console script for convenience). However, it became a point of contention whether this was worth the potential confusion and increased complexity such a system could create. I ultimately ended up stashing that line of work so that we could integrate the important changes, but if a more advanced dependency management workflow is desired it would not be hard to continue it.
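
A rough sketch of what that stashed approach could look like (the package name, extras, and entry point below are hypothetical, not the stashed code itself):

    # Hypothetical setup.py sketch for granular optional dependencies.
    from setuptools import setup, find_packages

    setup(
        name="gpt-neox",
        packages=find_packages(),
        install_requires=[
            # Core, loosely pinned dependencies would go here.
        ],
        extras_require={
            # Optional feature sets, installed as e.g. `pip install .[wandb]`.
            "wandb": ["wandb"],
            "tensorboard": ["tensorboard"],
            "eval": [],  # evaluation-only dependencies would go here
        },
        entry_points={
            # Register the DeepSpeed launcher as a `deepy` console command
            # (assumes a main() entry point in deepy.py).
            "console_scripts": ["deepy = deepy:main"],
        },
    )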

@StellaAthena
Member

Closing as it’s unfixably far behind and better done from scratch
