[Core][Hybrid allocator + connector] Support hybrid allocator + kv cache connector #23624
Signed-off-by: KuntaiDu <[email protected]>
…o GPU memory, the inference results are wrong. Fix this first. Signed-off-by: KuntaiDu <[email protected]>
…KuntaiDu/vllm into kuntai-support-hybrid-allocator Signed-off-by: KuntaiDu <[email protected]>
Will take a deeper look later.
Co-authored-by: Chen Zhang <[email protected]> Signed-off-by: Kuntai Du <[email protected]>
Some random idea to discuss
Checklist at the bottom is considered.
Purpose
This PR adds support for the code path that combines the hybrid allocator with a KV cache connector.
Design doc: link
Related to #23079
Solves #22292
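As a rough illustration of why this code path needs explicit support: with a hybrid allocator, layers of different attention types (for example full attention and sliding window) may be placed in separate KV cache groups, each with its own list of blocks, so a connector can no longer assume one flat list of block IDs per request. The toy sketch below is not vLLM code. `ToyKVCacheGroup` and `blocks_to_transfer` are hypothetical names introduced only for illustration, and the block IDs are made up.

```python
from dataclasses import dataclass


@dataclass
class ToyKVCacheGroup:
    """Hypothetical stand-in for one group of layers that share a block list."""
    name: str
    block_ids: list[int]


def blocks_to_transfer(groups: list[ToyKVCacheGroup]) -> dict[str, list[int]]:
    """Collect, per group, the block IDs a toy connector would save or load.

    Assumption: each group is handled independently, so the result keeps
    one block list per group instead of merging them into a single list.
    """
    return {group.name: list(group.block_ids) for group in groups}


# Illustrative request: the two groups hold different numbers of blocks.
groups = [
    ToyKVCacheGroup("full_attention", [0, 1, 2, 3]),
    ToyKVCacheGroup("sliding_window", [7, 8]),
]
plan = blocks_to_transfer(groups)
```

The point of the sketch is only the shape of the data: a per-group mapping rather than a single list. How the actual connector interface in this PR represents groups is described in the design doc, not here.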
Test Plan
The local correctness test passed. Instructions for reproducing it will be added later.
Core test logic:
Test Result
For the last request:
Essential Elements of an Effective PR Description Checklist