[Bugfix] Fix cuda event usage with CPU model runner #23643
Conversation
Code Review

This pull request fixes CUDA event usage with the CPU model runner by adding a `_torch_cuda_wrapper` that monkey-patches `torch.cuda.Event`. The changes look good overall, but there is a critical issue in the implementation of `_torch_cuda_wrapper` that could lead to a crash in environments without CUDA support. I've provided a suggestion to make it more robust. The other changes, including using `self.pin_memory` and refactoring tensor handling with `CpuGpuBuffer`, are solid improvements.
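For readers unfamiliar with the pattern the review refers to, here is a minimal, hypothetical sketch of a context manager that temporarily swaps `torch.cuda.Event` for a no-op placeholder while the CPU model runner runs. The placeholder's method names (`record`, `synchronize`) and the `getattr` guard are assumptions for illustration, not the PR's actual implementation.

```python
from contextlib import contextmanager

import torch


class _EventPlaceholder:
    """No-op stand-in for torch.cuda.Event on CPU-only runs."""

    def __init__(self, *args, **kwargs) -> None:
        # CUDA events are only used for timing/synchronization on GPU,
        # so recording and synchronizing can safely become no-ops here.
        self.record = lambda *args, **kwargs: None
        self.synchronize = lambda *args, **kwargs: None


@contextmanager
def _torch_cuda_wrapper():
    # Look the original up defensively so CPU-only torch builds without a
    # usable torch.cuda.Event do not crash, and always restore it on exit.
    original_event = getattr(torch.cuda, "Event", None)
    torch.cuda.Event = _EventPlaceholder
    try:
        yield
    finally:
        if original_event is not None:
            torch.cuda.Event = original_event
```

Code run inside `with _torch_cuda_wrapper(): ...` can then create and record "events" exactly as the GPU path does, without ever touching the CUDA runtime.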
Purpose
- Add `_EventPlaceholder` to work around CUDA event usage with `CPUModelRunner`.
- Refactor tensor handling with `CpuGpuBuffer` (see the sketch below).
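To make the `CpuGpuBuffer` bullet concrete, the following stand-alone sketch shows the paired CPU/GPU buffer pattern with an optional pinned host tensor (cf. `self.pin_memory` in the review). The class name `PairedBuffer`, its fields, and the `copy_to_gpu` helper are assumptions for illustration and do not mirror vLLM's real `CpuGpuBuffer` API.

```python
import torch


class PairedBuffer:
    """Illustrative CPU/GPU tensor pair in the spirit of CpuGpuBuffer.

    The CPU side is optionally pinned so host-to-device copies can be issued
    asynchronously on CUDA; on a CPU-only setup pin_memory should be False
    and the "gpu" view is simply the same tensor.
    """

    def __init__(self, *size: int, dtype: torch.dtype,
                 device: torch.device, pin_memory: bool) -> None:
        self.cpu = torch.zeros(*size, dtype=dtype, device="cpu",
                               pin_memory=pin_memory)
        # On a CUDA device keep a separate device tensor; on CPU reuse the
        # host tensor so callers can treat both cases uniformly.
        self.gpu = (self.cpu if device.type == "cpu"
                    else torch.zeros(*size, dtype=dtype, device=device))

    def copy_to_gpu(self, n: int) -> torch.Tensor:
        # Stage the first n rows on the device (a no-op view on CPU).
        if self.gpu is self.cpu:
            return self.cpu[:n]
        self.gpu[:n].copy_(self.cpu[:n], non_blocking=True)
        return self.gpu[:n]
```

The pinned staging tensor lets host-to-device copies overlap with compute on CUDA, while the CPU model runner skips the copy entirely by aliasing both views to the same tensor.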
Test Plan
CI tests
Test Result
Essential Elements of an Effective PR Description Checklist
- (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.