[Core] Add torch profiler CPU traces for AsyncLLM. #21794
Conversation
LGTM, thanks for the contribution!
Summary:
vLLM currently lacks the ability to generate CPU profiles, and the existing torch profiler integration does not cover the frontend/core processes. @WoosukKwon previously suggested profiling the CPU by setting
VLLM_ENABLE_V1_MULTIPROCESSING=0
but we still lack the ability to profile when the processes are split. For the AsyncLLM class, which runs the multimodal processor, the CPU work can be slow and worth profiling, so this PR adds support for profiling the CPU as well. A minimal sketch of the approach is shown below.
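For illustration, here is a minimal sketch of the idea, assuming a CPU-only `torch.profiler` instance owned by the frontend process (the structure is illustrative, not the exact vLLM code):

```python
import os

import torch

# Reuse the same trace directory as the GPU workers (assumption: the env
# var is set; the fallback path here is only for illustration).
profiler_dir = os.getenv("VLLM_TORCH_PROFILER_DIR", "/tmp/vllm_profile")

# CPU-only activities: the frontend process runs no CUDA work of its own,
# so a CPU trace is enough to capture multimodal preprocessing overhead.
profiler = torch.profiler.profile(
    activities=[torch.profiler.ProfilerActivity.CPU],
    with_stack=True,
    on_trace_ready=torch.profiler.tensorboard_trace_handler(
        profiler_dir, use_gzip=True),
)

profiler.start()
# ... serve requests; CPU-heavy frontend work (e.g. multimodal
# preprocessing) happens here ...
profiler.stop()  # the trace file lands in VLLM_TORCH_PROFILER_DIR
```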
For simplicity, I'm reusing the env var
VLLM_TORCH_PROFILER_DIR
for CPU traces as well, so both CPU & GPU profiles are consolidated into the same directory.
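A hedged end-to-end usage sketch (the `AsyncLLM.from_engine_args`, `start_profile`/`stop_profile`, and `generate` calls below reflect the v1 engine API as I understand it; the model name and directory are placeholders):

```python
import asyncio
import os

from vllm.engine.arg_utils import AsyncEngineArgs
from vllm.sampling_params import SamplingParams
from vllm.v1.engine.async_llm import AsyncLLM

# Must be set before the engine is created; both the frontend (CPU) and
# the workers (GPU) then write traces into this one directory.
os.environ["VLLM_TORCH_PROFILER_DIR"] = "/tmp/vllm_profile"


async def main() -> None:
    engine = AsyncLLM.from_engine_args(
        AsyncEngineArgs(model="facebook/opt-125m"))
    await engine.start_profile()
    async for _ in engine.generate("Hello, world",
                                   SamplingParams(max_tokens=8),
                                   request_id="profile-demo"):
        pass
    await engine.stop_profile()
    engine.shutdown()


asyncio.run(main())
```

After such a run, the trace directory should contain both the AsyncLLM CPU trace and the per-worker GPU traces.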
Test Plan:
Both CPU (AsyncLLM) & GPU (workers) profiles are produced:
[screenshot]
Example profile with image inputs:
[screenshot]
AsyncLLM:
[screenshot]
Reviewers:
@youkaichao @WoosukKwon