slime is an LLM post-training framework for RL scaling, providing two core capabilities:
- High-Performance Training: Supports efficient training in various modes by connecting Megatron with SGLang;
- Flexible Data Generation: Enables arbitrary training data generation workflows through custom data generation interfaces and server-based engines.
- Our vision: slime: An SGLang-Native Post-Training Framework for RL Scaling.
- Our ideas on agentic training: Agent-Oriented Design: An Asynchronous and Decoupled Framework for Agentic RL.
- slime has served as the RL framework for GLM-4.5: GLM-4.5: Reasoning, Coding, and Agentic Abilities
- Architecture Overview
- Quick Start
- Checkpoint Format Conversion
- Starting the Training Process
- Argument Descriptions
- Developer Guide
- FAQ & Acknowledgements
Module Descriptions:
- training (Megatron): Responsible for the main training process, reads data from the Data Buffer, and synchronizes parameters to the rollout module after training.
- rollout (SGLang + router): Generates new data (including rewards/verifier outputs) and stores it in the Data Buffer.
- data buffer: A bridge module that manages prompt initialization, custom data, and rollout generation methods.
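The data flow between these three modules can be sketched roughly as follows. This is an illustrative toy loop, not slime's actual API — the class and function names (`DataBuffer`, `rollout_step`, `train_step`) are hypothetical:

```python
from collections import deque

class DataBuffer:
    """Bridge between the rollout and training modules (illustrative sketch)."""
    def __init__(self):
        self._queue = deque()

    def put(self, samples):
        self._queue.extend(samples)

    def get(self, n):
        return [self._queue.popleft() for _ in range(min(n, len(self._queue)))]

def rollout_step(buffer):
    # The rollout module (SGLang + router) generates new samples, including
    # rewards / verifier outputs, and stores them in the Data Buffer.
    samples = [{"prompt": "p", "response": "r", "reward": 1.0}]
    buffer.put(samples)

def train_step(buffer, batch_size=1):
    # The training module (Megatron) reads a batch from the Data Buffer;
    # in the real framework it would then update the policy and synchronize
    # parameters back to the rollout module.
    batch = buffer.get(batch_size)
    return batch

buffer = DataBuffer()
rollout_step(buffer)
batch = train_step(buffer)
```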
For a comprehensive quick start guide covering environment setup, data preparation, training startup, and key code analysis, please refer to:
We also provide examples for some use cases not covered in the quick start guide; please check examples.
Arguments in slime are divided into three categories:
- Megatron arguments: slime reads all arguments set in Megatron via `PYTHONPATH`. You can configure Megatron by passing arguments like `--tensor-model-parallel-size 2`.
- SGLang arguments: All arguments of the installed SGLang are supported. These arguments must be prefixed with `--sglang-`. For example, `--mem-fraction-static` should be passed as `--sglang-mem-fraction-static`.
- slime-specific arguments: Please refer to: `slime/utils/arguments.py`
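As an illustration of the `--sglang-` prefix convention, prefixed flags can be separated from the remaining arguments and forwarded with the prefix stripped. This is a simplified sketch of the idea, not slime's actual argument-parsing code:

```python
def split_sglang_args(argv):
    """Split --sglang-* flags (and their values) from the remaining args.

    Illustrative sketch: --sglang-mem-fraction-static is forwarded to
    SGLang as --mem-fraction-static; everything else stays untouched.
    """
    sglang_args, other_args = [], []
    target = None  # list that the next positional value attaches to
    for arg in argv:
        if arg.startswith("--sglang-"):
            # Strip the prefix: --sglang-foo -> --foo
            sglang_args.append("--" + arg[len("--sglang-"):])
            target = sglang_args
        elif arg.startswith("--"):
            other_args.append(arg)
            target = other_args
        else:
            # A bare value belongs to whichever flag came last.
            (target or other_args).append(arg)
    return sglang_args, other_args

sglang, rest = split_sglang_args(
    ["--tensor-model-parallel-size", "2", "--sglang-mem-fraction-static", "0.8"]
)
```

With this split, `sglang` holds `["--mem-fraction-static", "0.8"]` ready to hand to SGLang, while `rest` keeps the Megatron-side flags.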
For complete usage instructions, please refer to the Usage Documentation.
- Contributions are welcome! If you have suggestions for new features, performance tuning, or feedback on user experience, feel free to submit an Issue or PR 😊
- Use pre-commit to ensure code style consistency for your commits:

      apt install pre-commit -y
      pre-commit install
- For debugging tips, please refer to the Debugging Guide.
- For frequently asked questions, please see the Q&A
- Special thanks to the following projects & communities: SGLang, Megatron‑LM, mbridge, OpenRLHF, veRL, Pai-Megatron-Patch and others.