[Feature]: Plugin framework for out-of-tree custom checkpoint loading #23009

@22quinn

Description

🚀 The feature, motivation and pitch

Motivation

Currently, vLLM primarily supports HuggingFace-format checkpoints, along with a few built-in custom loaders such as mistral. However, some proprietary reinforcement learning (RL) systems use custom checkpoint and weight formats with their own loading mechanisms, which makes adopting vLLM in these environments difficult. Support is required in two key areas (the two share some common code paths):

  1. Initial custom weight loading during rollout engine initialization
  2. Loading updated custom weights from the actor
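The two areas above could be captured by a single loader interface. A minimal sketch in Python, where all names (`CustomCheckpointLoader`, `load_initial_weights`, `update_weights`) are hypothetical and not vLLM's actual API:

```python
from abc import ABC, abstractmethod
from typing import Any, Iterable


class CustomCheckpointLoader(ABC):
    """Hypothetical interface covering both loading paths described above."""

    @abstractmethod
    def load_initial_weights(self, model: Any, checkpoint_path: str) -> None:
        """Area 1: populate the model during rollout-engine initialization."""

    @abstractmethod
    def update_weights(self, model: Any,
                       named_tensors: Iterable[tuple[str, Any]]) -> None:
        """Area 2: apply updated weights pushed from the RL actor."""
```

A concrete plugin would subclass this once and serve both the cold-start path and the in-place weight-update path with shared parsing code.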

Our attempt

We have developed a heavily modified version of vLLM to enable custom checkpoint loading for a specific proprietary format. While this solution works, a clean and maintainable implementation is necessary for production use. Unfortunately, we cannot open-source the internal implementation details or specifications.

Proposal

We propose building a fully out-of-tree plugin framework to support custom checkpoint/weight loading. This issue covers the OSS part; on top of the plugin framework, we will develop the actual plugins internally. Based on our experience with the hacked version, several components are essential.
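One shape such a framework could take is a registry keyed by load format, which out-of-tree packages populate at import time. A sketch, with all names (`register_checkpoint_loader`, `get_checkpoint_loader`, the format string) hypothetical:

```python
from typing import Callable, Dict

# Hypothetical registry mapping a load-format name to a loader factory.
_LOADER_REGISTRY: Dict[str, Callable[[], object]] = {}


def register_checkpoint_loader(name: str):
    """Decorator an out-of-tree plugin uses to register its loader class."""
    def wrap(factory: Callable[[], object]) -> Callable[[], object]:
        _LOADER_REGISTRY[name] = factory
        return factory
    return wrap


def get_checkpoint_loader(name: str) -> object:
    """Instantiate the loader registered under `name`."""
    try:
        return _LOADER_REGISTRY[name]()
    except KeyError:
        raise ValueError(f"unknown load format: {name!r}") from None


# An out-of-tree plugin package would then do:
@register_checkpoint_loader("my-proprietary-format")
class MyLoader:
    def load_initial_weights(self, model, checkpoint_path):
        ...  # proprietary parsing stays in the plugin, out of tree

    def update_weights(self, model, named_tensors):
        ...
```

The key property is that vLLM core only sees the registry and the interface; the proprietary format logic never needs to live in-tree.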

How to measure success

  1. Thorough testing within our proprietary RL system
  2. As a proof of concept, we could convert one of the built-in loaders, such as mistral ([Model] Allow loading from original Mistral format #8168), into an out-of-tree plugin.
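For the proof of concept, the out-of-tree package could advertise its loader through a Python entry point that the framework discovers at startup. A sketch of the plugin package's metadata, where the package name and entry-point group are illustrative, not an existing vLLM convention:

```toml
# pyproject.toml of a hypothetical out-of-tree loader plugin package
[project]
name = "vllm-mistral-loader-plugin"
version = "0.1.0"

# Illustrative entry-point group; the framework would scan this group
# and call each registration function at engine startup.
[project.entry-points."vllm.checkpoint_loaders"]
mistral = "vllm_mistral_loader:register"
```

Entry-point discovery keeps the plugin installable as an ordinary pip package, with no changes to the vLLM source tree.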

The future: In-tree or plugins?

For discussion: Do we want to maintain third-party loaders in-tree, or gradually move them out of tree? What are the criteria for deciding what stays in tree and what becomes a plugin?

Alternatives

Alternative: Develop the custom loading in-tree via internal directory branching, without open-sourcing it. This option has major downsides; for example, vLLM model loading is currently tightly coupled with the transformers library, so adopting this approach would also mean maintaining internal versions of several major configuration and loader components.

Additional context

No response

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

Metadata

Assignees

No one assigned

    Labels

feature request (New feature or request), rl (Related to RL workflows)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests