Nixl optimization for llama4 local attention #87
…aft model to free ~1GB for llama 3 model (vllm-project#17326) Co-authored-by: root <[email protected]> Co-authored-by: Woosuk Kwon <[email protected]>
Signed-off-by: mgoin <[email protected]>
Signed-off-by: mgoin <[email protected]>
Signed-off-by: mgoin <[email protected]>
Co-authored-by: Aaron Pham <[email protected]>
Signed-off-by: Lucas Wilkinson <[email protected]>
Signed-off-by: Aaron Pham <[email protected]> Co-authored-by: Russell Bryant <[email protected]>
Signed-off-by: mgoin <[email protected]>
…-project#17826) Signed-off-by: Jerry Zhang <[email protected]>
) Signed-off-by: Russell Bryant <[email protected]>
Signed-off-by: mgoin <[email protected]> Signed-off-by: Nick Hill <[email protected]> Co-authored-by: Nick Hill <[email protected]>
…ct#17945) Signed-off-by: Chen Zhang <[email protected]>
Signed-off-by: Mark McLoughlin <[email protected]>
Signed-off-by: Aaron Pham <[email protected]>
Signed-off-by: reidliu41 <[email protected]> Co-authored-by: reidliu41 <[email protected]>
…m-project#18154) Signed-off-by: Luka Govedič <[email protected]>
Signed-off-by: Harry Mellor <[email protected]>
…llm-project#18013) Signed-off-by: Thomas Parnell <[email protected]> Co-authored-by: Lucas Wilkinson <[email protected]>
Signed-off-by: Andy Xie <[email protected]>
Signed-off-by: inkcherry <[email protected]>
…llm-project#18178) Signed-off-by: Mengqing Cao <[email protected]>
Signed-off-by: David Xia <[email protected]>
Signed-off-by: Russell Bryant <[email protected]>
Signed-off-by: omahs <[email protected]>
Signed-off-by: Harry Mellor <[email protected]>
Signed-off-by: Lucia Fang <[email protected]>
Signed-off-by: Lucas Wilkinson <[email protected]>
…-project#18229) Signed-off-by: Lucas Wilkinson <[email protected]>
…attention on ROCm (vllm-project#18093) Signed-off-by: kf <[email protected]>
Signed-off-by: lisiqi23 <[email protected]> Signed-off-by: skylee-01 <[email protected]> Co-authored-by: lisiqi23 <[email protected]>
Signed-off-by: Harry Mellor <[email protected]>
…llm-project#18209) Signed-off-by: Will Eaton <[email protected]>
…ce for V1 (vllm-project#17827) Signed-off-by: Lucia Fang <[email protected]>
Signed-off-by: David Xia <[email protected]>
vllm-project#17973) Signed-off-by: Vadim Gimpelson <[email protected]>
Signed-off-by: Seiji Eicher <[email protected]>
vllm-project#18214) Signed-off-by: Isotr0py <[email protected]>
Signed-off-by: Felix Marty <[email protected]>
Signed-off-by: learner0810 <[email protected]>
Signed-off-by: reidliu41 <[email protected]> Co-authored-by: reidliu41 <[email protected]>
Signed-off-by: Nick Hill <[email protected]>
Signed-off-by: mgoin <[email protected]>
Signed-off-by: mgoin <[email protected]>
This pull request has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this pull request should remain open. Thank you!
This pull request has been automatically closed due to inactivity. Please feel free to reopen if you intend to continue working on it. Thank you!
No description provided.