[Feature]: Plugin framework for out-of-tree custom checkpoint loading #23009

@22quinn

Description

🚀 The feature, motivation and pitch

Motivation

Currently, vLLM primarily supports HuggingFace-format checkpoints, along with a few built-in custom loaders such as mistral. However, some proprietary reinforcement learning (RL) systems use custom checkpoint and weight formats with their own loading mechanisms, which makes adopting vLLM in these environments difficult. Support is required in two key areas (the two share some common code paths):

  1. Initial custom weight loading during rollout engine initialization
  2. Loading updated custom weights from the actor
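The two areas above could be captured by a single loader interface. A minimal sketch in Python, where all names (`CustomCheckpointLoader`, `load_initial_weights`, `update_weights`) are hypothetical and not vLLM's actual API:

```python
from abc import ABC, abstractmethod
from typing import Any, Iterable


class CustomCheckpointLoader(ABC):
    """Hypothetical interface covering both loading paths described above."""

    @abstractmethod
    def load_initial_weights(self, model: Any, checkpoint_path: str) -> None:
        """Area 1: populate the model during rollout-engine initialization."""

    @abstractmethod
    def update_weights(self, model: Any,
                       named_tensors: Iterable[tuple[str, Any]]) -> None:
        """Area 2: apply updated weights pushed from the RL actor."""
```

A concrete plugin would subclass this once and serve both the cold-start path and the in-place weight-update path with shared parsing code.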

Our attempt

We have developed a heavily modified version of vLLM to enable custom checkpoint loading for a specific proprietary format. While this solution works, a clean and maintainable implementation is necessary for production use. Unfortunately, we cannot open-source the internal implementation details or specifications.

Proposal

We propose building a fully out-of-tree plugin framework to support custom checkpoint/weight loading. This issue covers the OSS part; on top of the plugin framework, we will develop the actual plugins internally. Based on our experience with the hacked version, several components are essential.
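One shape such a framework could take is a registry keyed by load format, which out-of-tree packages populate at import time. A sketch, with all names (`register_checkpoint_loader`, `get_checkpoint_loader`, the format string) hypothetical:

```python
from typing import Callable, Dict

# Hypothetical registry mapping a load-format name to a loader factory.
_LOADER_REGISTRY: Dict[str, Callable[[], object]] = {}


def register_checkpoint_loader(name: str):
    """Decorator an out-of-tree plugin uses to register its loader class."""
    def wrap(factory: Callable[[], object]) -> Callable[[], object]:
        _LOADER_REGISTRY[name] = factory
        return factory
    return wrap


def get_checkpoint_loader(name: str) -> object:
    """Instantiate the loader registered under `name`."""
    try:
        return _LOADER_REGISTRY[name]()
    except KeyError:
        raise ValueError(f"unknown load format: {name!r}") from None


# An out-of-tree plugin package would then do:
@register_checkpoint_loader("my-proprietary-format")
class MyLoader:
    def load_initial_weights(self, model, checkpoint_path):
        ...  # proprietary parsing stays in the plugin, out of tree

    def update_weights(self, model, named_tensors):
        ...
```

The key property is that vLLM core only sees the registry and the interface; the proprietary format logic never needs to live in-tree.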

How to measure success

  1. Thorough testing within our proprietary RL system
  2. As a proof of concept, we could convert one of the built-in loaders, such as mistral ([Model] Allow loading from original Mistral format #8168), into an out-of-tree plugin.
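For the proof of concept, the out-of-tree package could advertise its loader through a Python entry point that the framework discovers at startup. A sketch of the plugin package's metadata, where the package name and entry-point group are illustrative, not an existing vLLM convention:

```toml
# pyproject.toml of a hypothetical out-of-tree loader plugin package
[project]
name = "vllm-mistral-loader-plugin"
version = "0.1.0"

# Illustrative entry-point group; the framework would scan this group
# and call each registration function at engine startup.
[project.entry-points."vllm.checkpoint_loaders"]
mistral = "vllm_mistral_loader:register"
```

Entry-point discovery keeps the plugin installable as an ordinary pip package, with no changes to the vLLM source tree.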

The future: In-tree or plugins?

For discussion: Do we want to maintain third-party loaders in-tree, or gradually move them out of tree? What are the criteria for deciding what stays in tree and what becomes a plugin?

Alternatives

Alternative: Develop the custom loading in-tree via internal directory branching, without open-sourcing it. This option has major downsides; for example, vLLM model loading is currently tightly coupled with the transformers library, so adopting this approach would also mean maintaining internal versions of several major configuration and loader components.

Additional context

No response

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

Metadata

Assignees

No one assigned

    Labels

feature request (New feature or request), rl (Related to RL workflows)

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests