[Frontend] Support tool calling and reasoning parser #14511
Conversation
Force-pushed from 3cef121 to df45cb1
Force-pushed from df45cb1 to d1a57e7
Thanks for the PR. I will help review it. FYI, I have a refactor for the reasoning support in #14428; not sure if you can reuse it.
I have reviewed PR #14428, and I think we can reuse the is_reasoning_end function. After your review, I will proceed with the refactoring.
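As a rough sketch of the gating idea under discussion (the interface below is hypothetical and only borrows the is_reasoning_end name from #14428; it is not the actual vLLM API):

```python
from typing import Protocol, Sequence


class ReasoningParserLike(Protocol):
    """Hypothetical stand-in for the parser interface discussed in #14428."""

    def is_reasoning_end(self, token_ids: Sequence[int]) -> bool: ...


def should_run_tool_parser(parser: ReasoningParserLike,
                           token_ids: Sequence[int]) -> bool:
    # Tool calls are expected only in the final content, so the tool parser
    # is skipped while the model is still emitting reasoning tokens.
    return parser.is_reasoning_end(token_ids)
```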
Review comment on vllm/entrypoints/openai/reasoning_parsers/abs_reasoning_parsers.py (outdated, now resolved):
The streaming logic looks good to me, but does it also support full generation?
Besides this, we could also add some example code and tests.
And we should:
Yes, it supports full generation. I will add some example code and tests.
OK
Force-pushed from d1a57e7 to 96558b1
Force-pushed from 01e43db to 0b9cc13
I think we can merge this before the refactor; I can rebase in that PR.
Force-pushed from 0b9cc13 to 62043bd
I think this PR is ready to merge. @gaocegege @russellb
Force-pushed from 62043bd to 0cc007f
Will this PR be included in version 0.7.4? When will it be released?
@szpnygo I hope so.
Force-pushed from 16e984e to 4b96d9c
Can you change the title to "[Frontend] Support tool calling and reasoning parser"?
OK, done.
Force-pushed from 2e87f90 to 47a7c16
@robertgshaw2-redhat @mgoin Could you please take a look? It blocks some other PRs about the reasoning parser.
cc @simon-mo
When streaming requests to the model, the output tool message is incomplete.
It may be a tool parser problem.
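One thing worth ruling out when debugging this: with the OpenAI-compatible streaming API, tool call arguments arrive as incremental deltas that the client has to concatenate, so an "incomplete" tool message can also come from client-side accumulation rather than the parser. A minimal accumulation sketch (server URL, model name, and tool schema are placeholders, not part of this PR):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

weather_tool = {  # placeholder tool schema for illustration
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

stream = client.chat.completions.create(
    model="Qwen/QwQ-32B",  # placeholder model name
    messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
    tools=[weather_tool],
    tool_choice="auto",
    stream=True,
)

# Tool call fragments are keyed by index and must be concatenated client-side.
tool_calls = {}
for chunk in stream:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    for tc in delta.tool_calls or []:
        entry = tool_calls.setdefault(tc.index, {"name": "", "arguments": ""})
        if tc.function and tc.function.name:
            entry["name"] = tc.function.name
        if tc.function and tc.function.arguments:
            entry["arguments"] += tc.function.arguments

print(tool_calls)
```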
Remove the mutual exclusion restriction between tool calling and the reasoning parser, since models like QwQ-32B can support both at the same time. Additionally, parsing tool calls only from content rather than from reasoning_content can improve the accuracy and performance of tool calling.
This PR resolves issue #14490.
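For context, here is a minimal end-to-end sketch of how the combined setup could be exercised once this lands. The model name, parser choices, and server flags below are assumptions based on the QwQ-32B example above and vLLM's existing reasoning/tool-calling options, not something this PR prescribes:

```python
# Assumed server launch (flags and parser names are illustrative):
#   vllm serve Qwen/QwQ-32B \
#     --enable-reasoning --reasoning-parser deepseek_r1 \
#     --enable-auto-tool-choice --tool-call-parser hermes
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

weather_tool = {  # same placeholder tool schema as in the streaming sketch above
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

response = client.chat.completions.create(
    model="Qwen/QwQ-32B",
    messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
    tools=[weather_tool],
    tool_choice="auto",
)

message = response.choices[0].message
# With this change, a reasoning model can return both fields: reasoning_content
# holds the chain of thought, while tool calls are parsed only from content.
print(getattr(message, "reasoning_content", None))
print(message.tool_calls)
```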