slime is an LLM post-training framework for RL scaling, providing two core capabilities:
- High-Performance Training: Supports efficient training in various modes by connecting Megatron with SGLang;
- Flexible Data Generation: Enables arbitrary training data generation workflows through custom data generation interfaces and server-based engines.
- Our vision: slime: An SGLang-Native Post-Training Framework for RL Scaling.
- Our ideas on agentic training: Agent-Oriented Design: An Asynchronous and Decoupled Framework for Agentic RL.
- slime has served as the RL framework for GLM-4.5: GLM-4.5: Reasoning, Coding, and Agentic Abilities
- Architecture Overview
- Quick Start
- Checkpoint Format Conversion
- Starting the Training Process
- Argument Descriptions
- Developer Guide
- FAQ & Acknowledgements
Module Descriptions:
- training (Megatron): Responsible for the main training process, reads data from the Data Buffer, and synchronizes parameters to the rollout module after training.
- rollout (SGLang + router): Generates new data (including rewards/verifier outputs) and stores it in the Data Buffer.
- data buffer: A bridge module that manages prompt initialization, custom data, and rollout generation methods.
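The data flow between these three modules can be sketched roughly as follows. This is an illustrative toy loop, not slime's actual API — the class and function names (`DataBuffer`, `rollout_step`, `train_step`) are hypothetical:

```python
from collections import deque

class DataBuffer:
    """Bridge between the rollout and training modules (illustrative sketch)."""
    def __init__(self):
        self._queue = deque()

    def put(self, samples):
        self._queue.extend(samples)

    def get(self, n):
        return [self._queue.popleft() for _ in range(min(n, len(self._queue)))]

def rollout_step(buffer):
    # The rollout module (SGLang + router) generates new samples, including
    # rewards / verifier outputs, and stores them in the Data Buffer.
    samples = [{"prompt": "p", "response": "r", "reward": 1.0}]
    buffer.put(samples)

def train_step(buffer, batch_size=1):
    # The training module (Megatron) reads a batch from the Data Buffer;
    # in the real framework it would then update the policy and synchronize
    # parameters back to the rollout module.
    batch = buffer.get(batch_size)
    return batch

buffer = DataBuffer()
rollout_step(buffer)
batch = train_step(buffer)
```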
For a comprehensive quick start guide covering environment setup, data preparation, training startup, and key code analysis, please refer to:
We also provide examples for some use cases not covered in the quick start guide; please check examples.
Arguments in slime are divided into three categories:
- Megatron arguments: slime reads all arguments set in Megatron via `PYTHONPATH`. You can configure Megatron by passing arguments like `--tensor-model-parallel-size 2`.
- SGLang arguments: All arguments of the installed SGLang are supported. These arguments must be prefixed with `--sglang-`. For example, `--mem-fraction-static` should be passed as `--sglang-mem-fraction-static`.
- slime-specific arguments: Please refer to: `slime/utils/arguments.py`
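As an illustration of the `--sglang-` prefix convention, prefixed flags can be separated from the remaining arguments and forwarded with the prefix stripped. This is a simplified sketch of the idea, not slime's actual argument-parsing code:

```python
def split_sglang_args(argv):
    """Split --sglang-* flags (and their values) from the remaining args.

    Illustrative sketch: --sglang-mem-fraction-static is forwarded to
    SGLang as --mem-fraction-static; everything else stays untouched.
    """
    sglang_args, other_args = [], []
    target = None  # list that the next positional value attaches to
    for arg in argv:
        if arg.startswith("--sglang-"):
            # Strip the prefix: --sglang-foo -> --foo
            sglang_args.append("--" + arg[len("--sglang-"):])
            target = sglang_args
        elif arg.startswith("--"):
            other_args.append(arg)
            target = other_args
        else:
            # A bare value belongs to whichever flag came last.
            (target or other_args).append(arg)
    return sglang_args, other_args

sglang, rest = split_sglang_args(
    ["--tensor-model-parallel-size", "2", "--sglang-mem-fraction-static", "0.8"]
)
```

With this split, `sglang` holds `["--mem-fraction-static", "0.8"]` ready to hand to SGLang, while `rest` keeps the Megatron-side flags.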
For complete usage instructions, please refer to the Usage Documentation.
- Contributions are welcome! If you have suggestions for new features, performance tuning, or feedback on user experience, feel free to submit an Issue or PR 😊
- Use pre-commit to ensure code style consistency for your commits:

      apt install pre-commit -y
      pre-commit install
- For debugging tips, please refer to the Debugging Guide.
- For frequently asked questions, please see the Q&A
- Special thanks to the following projects & communities: SGLang, Megatron‑LM, mbridge, OpenRLHF, veRL, Pai-Megatron-Patch and others.