
slime


slime is an LLM post-training framework for RL scaling, providing two core capabilities:

  1. High-Performance Training: Supports efficient training in various modes by connecting Megatron with SGLang;
  2. Flexible Data Generation: Enables arbitrary training data generation workflows through custom data generation interfaces and server-based engines.

Architecture Overview

[Figure: slime architecture diagram]

Module Descriptions:

  • training (Megatron): Runs the main training process, reads data from the Data Buffer, and synchronizes parameters to the rollout module after training.
  • rollout (SGLang + router): Generates new data (including rewards/verifier outputs) and stores it in the Data Buffer.
  • data buffer: A bridge module that manages prompt initialization, custom data, and rollout generation methods.
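
The data flow between the three modules can be sketched as a toy loop. This is only an illustration of the described architecture, not slime's actual code: the class names `DataBuffer`, `Rollout`, and `Trainer`, their methods, and the `weights_version` counter are all hypothetical stand-ins.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class DataBuffer:
    """Hypothetical bridge: holds prompts and the samples produced by rollout."""
    prompts: list
    samples: deque = field(default_factory=deque)

    def put(self, batch):
        self.samples.extend(batch)

    def get(self, n):
        # Hand out up to n samples, oldest first.
        return [self.samples.popleft() for _ in range(min(n, len(self.samples)))]


class Rollout:
    """Hypothetical stand-in for the SGLang + router side."""

    def __init__(self):
        self.weights_version = 0

    def generate(self, prompts):
        # A real rollout would call an inference engine plus a reward model or verifier.
        return [
            {"prompt": p, "response": f"answer to {p}", "reward": 1.0,
             "weights_version": self.weights_version}
            for p in prompts
        ]

    def update_weights(self, version):
        self.weights_version = version


class Trainer:
    """Hypothetical stand-in for the Megatron side."""

    def __init__(self):
        self.version = 0

    def train(self, batch):
        # A real trainer would run an optimizer step on the batch.
        if batch:
            self.version += 1
        return self.version


def run(num_steps, prompts):
    buffer, rollout, trainer = DataBuffer(prompts), Rollout(), Trainer()
    for _ in range(num_steps):
        buffer.put(rollout.generate(buffer.prompts))  # rollout -> data buffer
        batch = buffer.get(len(prompts))              # data buffer -> training
        rollout.update_weights(trainer.train(batch))  # training -> rollout sync
    return trainer.version, rollout.weights_version
```

Calling `run(3, ["p1", "p2"])` steps the loop three times; after each step the rollout side holds the same weights version as the trainer, which mirrors the parameter synchronization described above.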

Quick Start

For a comprehensive quick start guide covering environment setup, data preparation, training startup, and key code analysis, please refer to:

We also provide examples for some use cases not covered in the quick start guide; please check examples.

Arguments Walk Through

Arguments in slime are divided into three categories:

  1. Megatron arguments: slime reads all arguments defined by the Megatron installation found on PYTHONPATH. You can configure Megatron by passing its arguments directly, for example --tensor-model-parallel-size 2.
  2. SGLang arguments: All arguments for the installed SGLang are supported. These arguments must be prefixed with --sglang-. For example, --mem-fraction-static should be passed as --sglang-mem-fraction-static.
  3. slime-specific arguments: Please refer to: slime/utils/arguments.py
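
The --sglang- prefix rule can be illustrated with a small sketch. This is not how slime parses arguments internally; the helpers `to_slime_flag` and `split_args` are hypothetical, and the value 0.8 is only an example.

```python
SGLANG_PREFIX = "--sglang-"


def to_slime_flag(sglang_flag: str) -> str:
    """Map an SGLang flag to the prefixed form described above."""
    if not sglang_flag.startswith("--"):
        raise ValueError(f"expected a long option, got {sglang_flag!r}")
    return SGLANG_PREFIX + sglang_flag[2:]


def split_args(argv):
    """Separate prefixed SGLang flags (prefix stripped) from all other flags.

    Tokens that do not start with "--" are treated as values of the
    preceding flag and stay with it.
    """
    sglang, other = [], []
    target = other
    for token in argv:
        if token.startswith("--"):
            if token.startswith(SGLANG_PREFIX):
                target = sglang
                token = "--" + token[len(SGLANG_PREFIX):]
            else:
                target = other
        target.append(token)
    return sglang, other


argv = [
    "--tensor-model-parallel-size", "2",      # Megatron argument, passed as-is
    to_slime_flag("--mem-fraction-static"),   # SGLang argument, prefixed
    "0.8",                                    # example value only
]
sglang_args, other_args = split_args(argv)
```

Here `sglang_args` holds the SGLang flag with the prefix stripped again, and `other_args` holds the Megatron flag unchanged.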

For complete usage instructions, please refer to the Usage Documentation.

Developer Guide

  • Contributions are welcome! If you have suggestions for new features, performance tuning, or feedback on user experience, feel free to submit an Issue or PR 😊

  • Use pre-commit to ensure code style consistency for your commits:

    apt install pre-commit -y
    pre-commit install
  • For debugging tips, please refer to the Debugging Guide

FAQ & Acknowledgements

  • For frequently asked questions, please see the Q&A
  • Special thanks to the following projects & communities: SGLang, Megatron‑LM, mbridge, OpenRLHF, veRL, Pai-Megatron-Patch and others.