forked from vllm-project/vllm
prefill metadata splitting #112
Draft
LucasWilkinson wants to merge 307 commits into sage/dbo-full-cudagraphs from lwilkinson/dbo-prefill
+39,522
−65,238
Signed-off-by: Prashant Gupta <[email protected]>
Signed-off-by: Russell Bryant <[email protected]>
Signed-off-by: Woosuk Kwon <[email protected]> Signed-off-by: Woosuk Kwon <[email protected]>
Signed-off-by: Roger Wang <[email protected]> Signed-off-by: Isotr0py <[email protected]> Co-authored-by: Huang Jie <[email protected]> Co-authored-by: 松灵 <[email protected]> Co-authored-by: Isotr0py <[email protected]>
Signed-off-by: Lucas Wilkinson <[email protected]>
Signed-off-by: Lucas Wilkinson <[email protected]>
…ect#24649) Signed-off-by: Haoyang Li <[email protected]> Co-authored-by: Haoyang Li <[email protected]>
…mark_serving_multi_turn) (vllm-project#23255) Signed-off-by: daniels <[email protected]>
Signed-off-by: Kunshang Ji <[email protected]>
Signed-off-by: rouchenzi <[email protected]> Signed-off-by: rouchenzi <[email protected]> Co-authored-by: Bowen Wang <[email protected]>
…llm-project#24969) Signed-off-by: Lukas Geiger <[email protected]>
Signed-off-by: whx-sjtu <[email protected]>
Signed-off-by: whx-sjtu <[email protected]>
Signed-off-by: Zhuohan Li <[email protected]>
Signed-off-by: windsonsea <[email protected]>
Signed-off-by: Xinyu Chen <[email protected]> Co-authored-by: Kunshang Ji <[email protected]>
Signed-off-by: chaunceyjiang <[email protected]>
…tract_tool_call_required_streaming (vllm-project#24668) Signed-off-by: Shijun Yin <[email protected]>
…#25065) Signed-off-by: DarkLight1337 <[email protected]>
Signed-off-by: Daniel Afrimi <[email protected]> Co-authored-by: root <[email protected]>
…vllm-project#25046) Signed-off-by: jiang1.li <[email protected]>
Signed-off-by: Aidyn-A <[email protected]>
Signed-off-by: Dylan Maloy <[email protected]> Co-authored-by: Jee Jee Li <[email protected]>
…mentation. (vllm-project#24957) Signed-off-by: Tao He <[email protected]>
…d warning. (vllm-project#25010) Signed-off-by: samzong <[email protected]>
Signed-off-by: Matthew Bonanni <[email protected]> Signed-off-by: Matthew Bonanni <[email protected]>
…project#24970) Signed-off-by: samzong <[email protected]>
Signed-off-by: mgoin <[email protected]>
Signed-off-by: Woosuk Kwon <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]> Co-authored-by: Jee Jee Li <[email protected]>
Signed-off-by: DarkLight1337 <[email protected]>
Signed-off-by: Harry Mellor <[email protected]>
Signed-off-by: Harry Mellor <[email protected]> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Sara Kokkila Schumacher <[email protected]>
…3993) Signed-off-by: Csrayz <[email protected]> Signed-off-by: ivyilike <[email protected]> Co-authored-by: ivyilike <[email protected]>
…e.py Co-authored-by: Tyler Michael Smith <[email protected]> Signed-off-by: Lucas Wilkinson <[email protected]>
Signed-off-by: Lucas Wilkinson <[email protected]>
…ckends (vllm-project#24648) Signed-off-by: Burkhard Ringlein <[email protected]>
Signed-off-by: Bowen Wang <[email protected]> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Tyler Michael Smith <[email protected]>
…v variables (vllm-project#25274) Signed-off-by: qqma <[email protected]> Signed-off-by: DarkLight1337 <[email protected]> Co-authored-by: qqma <[email protected]> Co-authored-by: Cyrus Leung <[email protected]>
…ct#25391) Signed-off-by: ElizaWszola <[email protected]>
Signed-off-by: mgoin <[email protected]>
…roject#24899) Signed-off-by: Lu Fang <[email protected]> Signed-off-by: Zhuohan Li <[email protected]> Co-authored-by: Zhuohan Li <[email protected]>
…g utils, fix DCE bug (vllm-project#23091), fix test (vllm-project#24376), and prep for custom op matching (vllm-project#24604) (vllm-project#24542) Signed-off-by: Luka Govedič <[email protected]> Signed-off-by: luka <[email protected]> Signed-off-by: Luka Govedič <[email protected]>
Signed-off-by: Or Ozeri <[email protected]>
…ly released (vllm-project#25394) Signed-off-by: DarkLight1337 <[email protected]>
…t#25278) Signed-off-by: Johnny Yang <[email protected]> Co-authored-by: Chengji Yao <[email protected]>
…ough headers (vllm-project#24628) Signed-off-by: Alec Solder <[email protected]> Signed-off-by: Alec S <[email protected]> Co-authored-by: Alec Solder <[email protected]> Co-authored-by: Ye (Charlotte) Qi <[email protected]>
Signed-off-by: Luka Govedič <[email protected]>
Signed-off-by: Russell Bryant <[email protected]>
Signed-off-by: NickLucche <[email protected]>
Signed-off-by: Matthew Bonanni <[email protected]> Co-authored-by: Robert Shaw <[email protected]> Co-authored-by: Chris Bamford <[email protected]>
…tput handling (vllm-project#25184) Signed-off-by: Alexander Matveev <[email protected]>
…24611) Signed-off-by: yewentao256 <[email protected]>
…t#25410) Signed-off-by: Isotr0py <[email protected]>
…#25409) Signed-off-by: Isotr0py <[email protected]>
Signed-off-by: yewentao256 <[email protected]>
Signed-off-by: Tyler Michael Smith <[email protected]>
Signed-off-by: Lucas Wilkinson <[email protected]>
Purpose
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.