Skip to content

Conversation

youkaichao
Copy link
Member

@youkaichao youkaichao commented Sep 2, 2025

Purpose

Redo #23960 in an upstream branch so that we can trigger build with custom env vars.

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@mergify mergify bot added the ci/build label Sep 2, 2025
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the release pipeline to use CUDA 12.9, which is necessary for the PyTorch 2.8.0 update. The changes correctly update the CUDA versions and corresponding build identifiers in the Buildkite pipeline configuration. The wheel upload script has also been modified to handle CUDA 12.9 as the new default version for publishing wheel indices. My review includes suggestions to improve the robustness of the wheel upload script to make it more maintainable and prevent potential issues in future releases.

Comment on lines +61 to 72
if [[ $normal_wheel == *"cu126"* ]]; then
# if $normal_wheel matches cu126, do not upload the index.html
echo "Skipping index files for cu126 wheels"
elif [[ $normal_wheel == *"cu128"* ]]; then
# if $normal_wheel matches cu128, do not upload the index.html
echo "Skipping index files for cu128 wheels"
else
# only upload index.html for cu128 wheels (default wheels)
# only upload index.html for cu129 wheels (default wheels) as it
# is available on both x86 and arm64
aws s3 cp index.html "s3://vllm-wheels/$BUILDKITE_COMMIT/vllm/index.html"
aws s3 cp "s3://vllm-wheels/nightly/index.html" "s3://vllm-wheels/$BUILDKITE_COMMIT/index.html"
fi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current if/elif/else structure for determining whether to upload index files is fragile. It assumes that any wheel not matching cu126 or cu128 is the default cu129. This could lead to incorrect behavior if other wheel versions are added to the build pipeline in the future.

Using a case statement would make this logic more explicit and robust. It clearly defines the action for each known wheel type and can provide a warning for unexpected versions, preventing silent failures in the release process.

Suggested change
if [[ $normal_wheel == *"cu126"* ]]; then
# if $normal_wheel matches cu126, do not upload the index.html
echo "Skipping index files for cu126 wheels"
elif [[ $normal_wheel == *"cu128"* ]]; then
# if $normal_wheel matches cu128, do not upload the index.html
echo "Skipping index files for cu128 wheels"
else
# only upload index.html for cu128 wheels (default wheels)
# only upload index.html for cu129 wheels (default wheels) as it
# is available on both x86 and arm64
aws s3 cp index.html "s3://vllm-wheels/$BUILDKITE_COMMIT/vllm/index.html"
aws s3 cp "s3://vllm-wheels/nightly/index.html" "s3://vllm-wheels/$BUILDKITE_COMMIT/index.html"
fi
case "$normal_wheel" in
*cu129*)
# only upload index.html for cu129 wheels (default wheels) as it
# is available on both x86 and arm64
aws s3 cp index.html "s3://vllm-wheels/$BUILDKITE_COMMIT/vllm/index.html"
aws s3 cp "s3://vllm-wheels/nightly/index.html" "s3://vllm-wheels/$BUILDKITE_COMMIT/index.html"
;;
*cu126*|*cu128*)
# if $normal_wheel matches cu126 or cu128, do not upload the index.html
echo "Skipping index files for $normal_wheel wheels"
;;
*)
echo "WARNING: Unhandled wheel version for index upload: $normal_wheel. Not uploading index."
;;
esac

Comment on lines +78 to 88
if [[ $normal_wheel == *"cu126"* ]]; then
# if $normal_wheel matches cu126, do not upload the index.html
echo "Skipping index files for cu126 wheels"
elif [[ $normal_wheel == *"cu128"* ]]; then
# if $normal_wheel matches cu128, do not upload the index.html
echo "Skipping index files for cu128 wheels"
else
# only upload index.html for cu128 wheels (default wheels)
# only upload index.html for cu129 wheels (default wheels) as it
# is available on both x86 and arm64
aws s3 cp index.html "s3://vllm-wheels/nightly/vllm/index.html"
fi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Similar to the block above, this if/elif/else structure is fragile. It implicitly assumes any wheel not matching cu126 or cu128 is the default cu129 for which the nightly index should be updated.

Refactoring to a case statement will make the logic more robust and maintainable, explicitly handling known cases and warning about unknown ones. This is important for the stability of the release pipeline.

Suggested change
if [[ $normal_wheel == *"cu126"* ]]; then
# if $normal_wheel matches cu126, do not upload the index.html
echo "Skipping index files for cu126 wheels"
elif [[ $normal_wheel == *"cu128"* ]]; then
# if $normal_wheel matches cu128, do not upload the index.html
echo "Skipping index files for cu128 wheels"
else
# only upload index.html for cu128 wheels (default wheels)
# only upload index.html for cu129 wheels (default wheels) as it
# is available on both x86 and arm64
aws s3 cp index.html "s3://vllm-wheels/nightly/vllm/index.html"
fi
case "$normal_wheel" in
*cu129*)
# only upload index.html for cu129 wheels (default wheels) as it
# is available on both x86 and arm64
aws s3 cp index.html "s3://vllm-wheels/nightly/vllm/index.html"
;;
*cu126*|*cu128*)
# if $normal_wheel matches cu126 or cu128, do not upload the index.html
echo "Skipping index files for $normal_wheel wheels"
;;
*)
echo "WARNING: Unhandled wheel version for nightly index upload: $normal_wheel. Not uploading index."
;;
esac

@youkaichao youkaichao added the ready ONLY add when PR is ready to merge/full CI is needed label Sep 2, 2025
Signed-off-by: youkaichao <[email protected]>
Signed-off-by: youkaichao <[email protected]>
Signed-off-by: youkaichao <[email protected]>
@nWEIdia
Copy link

nWEIdia commented Sep 2, 2025

I can see this PR pushed a new aarch64 image from the 2nd last commit of this branch, great! (See: https://gallery.ecr.aws/q9t5s3a7/vllm-release-repo 43a6f63d0860ef2f545122332d7a377e99010838-aarch64)
Do you know why the top commit did not generate a new aarch64 build?

@youkaichao
Copy link
Member Author

Do you know why the top commit did not generate a new aarch64 build?

because I don't trigger release build for the latest commit.

Copy link

@nWEIdia nWEIdia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me, thanks!

Copy link
Contributor

@huydhn huydhn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stamped! Retrying on #23960 seems to work for me too to avoid timing out in the release build https://buildkite.com/vllm/release/builds/7828. So both PR(s) are fine I think

@youkaichao youkaichao merged commit 42dc59d into main Sep 3, 2025
18 checks passed
@youkaichao youkaichao deleted the update-release-pipeline-2.8.0-release branch September 3, 2025 02:09
@youkaichao
Copy link
Member Author

Retrying on #23960 seems to work for me too to avoid timing out in the release build

yeah that's because I manually triggered a 10 hour build, now the compilation cache is populated, and later release build can be much faster.

845473182 pushed a commit to 845473182/vllm that referenced this pull request Sep 3, 2025
* 'main' of https://github.com/845473182/vllm: (457 commits)
  [BugFix] Fix routed_scaling_factor double mul for dots1 and glm4 MoE models (vllm-project#24132)
  [Misc] Add check for dual_chunk_attention (vllm-project#24070)
  [Doc]: fix typos in Python comments (vllm-project#24115)
  [Doc]: fix typos in Python comments (vllm-project#24093)
  [Compile] Fix Compile Warning for `w4a8_mm_entry.cu` (vllm-project#23660)
  fix some typos (vllm-project#24071)
  [V1] Wrapper which plumbs request-level logits processors into vLLM batch-level logits processing (vllm-project#23656)
  Upgrade xgrammar to 0.1.23 (vllm-project#22988)
  Update release pipeline post PyTorch 2.8.0 update (vllm-project#24073)
  [XPU] Fix the bug of LoRA logits on the XPU platform (vllm-project#24081)
  [CI/Build] Disable SiluMul NVFP4 quant fusion tests (vllm-project#24121)
  [Bug] R1 Accuracy: Fix `routed_scaling_factor` Double Mul Issue (vllm-project#24119)
  [AMD][Kernel][Bugfix] Cast offsets tensor bn to tl.int64 to avoid GPU segfault (vllm-project#23692)
  [CI] Enable all hf transformers baselines in test_hybrid (vllm-project#23936)
  [Log] Only Print Profiler Results on Rank 0 (vllm-project#23370)
  Fix weights loading for Apertus (vllm-project#24100)
  [Metrics] Deprecate TPOT in favor of ITL (vllm-project#24110)
  [Bugfix] Fix packed_factor missing attribute error (vllm-project#23902)
  Run ruff format on a few files. (vllm-project#24075)
  [Bugfix] Fix transform_config parsing in Compressed Tensors (vllm-project#23945)
  ...
842974287 pushed a commit to 842974287/vllm that referenced this pull request Sep 3, 2025
Signed-off-by: Huy Do <[email protected]>
Signed-off-by: youkaichao <[email protected]>
Co-authored-by: Huy Do <[email protected]>
Signed-off-by: Shiyan Deng <[email protected]>
eicherseiji pushed a commit to eicherseiji/vllm that referenced this pull request Sep 9, 2025
LopezCastroRoberto pushed a commit to LopezCastroRoberto/vllm that referenced this pull request Sep 11, 2025
Signed-off-by: Huy Do <[email protected]>
Signed-off-by: youkaichao <[email protected]>
Co-authored-by: Huy Do <[email protected]>
Signed-off-by: LopezCastroRoberto <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci/build ready ONLY add when PR is ready to merge/full CI is needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants