
Conversation

nathanrchn
Contributor

@nathanrchn nathanrchn commented Sep 2, 2025

Purpose

This PR fixes the weight-loading mechanism for the Apertus models. The model uses a custom XIELU activation function that needs custom parameters for each layer. These parameters are stored alongside the model's weights, so the buffers need to be loaded as well.
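In PyTorch, `named_parameters()` does not return buffers, so a weight loader keyed only on parameters will silently skip per-layer values like XIELU's `beta` and `eps`. A minimal sketch of the problem, using a hypothetical `ToyXIELU` stand-in for the real activation:

```python
import torch
import torch.nn as nn

class ToyXIELU(nn.Module):
    # Hypothetical stand-in for the XIELU activation: alpha_p is a
    # learnable parameter, while beta and eps are per-layer buffers
    # that are still serialized with the checkpoint.
    def __init__(self):
        super().__init__()
        self.alpha_p = nn.Parameter(torch.tensor(0.8))
        self.register_buffer("beta", torch.tensor(0.5))
        self.register_buffer("eps", torch.tensor(-1e-6))

model = nn.Sequential(ToyXIELU())

# A params_dict built only from parameters misses the buffers
# that the checkpoint also contains.
params_dict = dict(model.named_parameters())
assert "0.beta" not in params_dict

# Adding named_buffers() makes beta/eps loadable like any other weight.
params_dict.update(model.named_buffers())
assert "0.beta" in params_dict and "0.eps" in params_dict
```

This is the gap the PR closes: the buffers must be registered in `params_dict` explicitly so the loader can match them against checkpoint tensors.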

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.


github-actions bot commented Sep 2, 2025

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs will not trigger a full CI run by default. Instead, only fastcheck CI will run, covering a small and essential subset of CI tests to quickly catch errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run full CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request correctly addresses the need to load buffers for the XIELU activation function in the Apertus model. The implementation adds the necessary buffers to the params_dict for weight loading. However, the method used to identify these buffers by their name suffix is slightly brittle and could lead to issues in the future. I've suggested a more robust approach that directly targets XIELU modules, which will be more resilient to future changes in the model architecture.

Comment on lines +420 to +422
for name, buffer in self.named_buffers():
    if name.endswith(".beta") or name.endswith(".eps"):
        params_dict[name] = buffer
Contributor


Severity: high

The current approach of identifying buffers to load by checking if their names end with .beta or .eps is a bit brittle. It could unintentionally match buffers from other parts of the model if they happen to use the same suffix. This could lead to incorrect weights being loaded or errors if shapes don't match.

A more robust approach would be to explicitly iterate over XIELU modules and add their buffers to params_dict. This ensures that you are only targeting the intended buffers and makes the code more resilient to future changes.

Suggested change

- for name, buffer in self.named_buffers():
-     if name.endswith(".beta") or name.endswith(".eps"):
-         params_dict[name] = buffer
+ for module_name, module in self.named_modules():
+     if isinstance(module, XIELU):
+         for buffer_name, buffer in module.named_buffers(recurse=False):
+             full_name = f"{module_name}.{buffer_name}"
+             params_dict[full_name] = buffer
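The collision the reviewer warns about can be demonstrated directly: any unrelated module that registers a buffer named `eps` would also match the suffix check, while type-based matching targets only the activation's buffers. A sketch under assumed names (`XIELU` here is a minimal stand-in, and `NormWithEps` is a hypothetical unrelated module):

```python
import torch
import torch.nn as nn

class XIELU(nn.Module):
    # Minimal stand-in for the real activation's buffers.
    def __init__(self):
        super().__init__()
        self.register_buffer("beta", torch.tensor(0.5))
        self.register_buffer("eps", torch.tensor(-1e-6))

class NormWithEps(nn.Module):
    # Hypothetical unrelated module that also registers an "eps" buffer.
    def __init__(self):
        super().__init__()
        self.register_buffer("eps", torch.tensor(1e-5))

model = nn.ModuleDict({"act": XIELU(), "norm": NormWithEps()})

# Suffix matching also grabs the norm's eps buffer (the brittleness).
by_suffix = {n for n, _ in model.named_buffers()
             if n.endswith(".beta") or n.endswith(".eps")}
assert "norm.eps" in by_suffix  # unintended match

# Type-based matching selects only the XIELU buffers.
by_type = {f"{m}.{b}"
           for m, mod in model.named_modules() if isinstance(mod, XIELU)
           for b, _ in mod.named_buffers(recurse=False)}
assert by_type == {"act.beta", "act.eps"}
```

Whether the extra robustness is worth it depends on the model: in the merged PR, the suffix check was judged sufficient for Apertus as it stands.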

@nathanrchn nathanrchn force-pushed the fix/weights_loading_apertus branch from 797e773 to 7bab46c on September 2, 2025 13:08
@DarkLight1337 DarkLight1337 enabled auto-merge (squash) September 2, 2025 13:19
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Sep 2, 2025
@malte-aws

Hi,
What is the ETA for getting this merged?

@DarkLight1337
Member

Retrying the failing test

@DarkLight1337 DarkLight1337 merged commit 598bd74 into vllm-project:main Sep 2, 2025
50 checks passed
@nathanrchn nathanrchn deleted the fix/weights_loading_apertus branch September 2, 2025 19:43
rzabarazesh pushed a commit to rzabarazesh/vllm that referenced this pull request Sep 2, 2025
845473182 pushed a commit to 845473182/vllm that referenced this pull request Sep 3, 2025
* 'main' of https://github.com/845473182/vllm: (457 commits)
  [BugFix] Fix routed_scaling_factor double mul for dots1 and glm4 MoE models (vllm-project#24132)
  [Misc] Add check for dual_chunk_attention (vllm-project#24070)
  [Doc]: fix typos in Python comments (vllm-project#24115)
  [Doc]: fix typos in Python comments (vllm-project#24093)
  [Compile] Fix Compile Warning for `w4a8_mm_entry.cu` (vllm-project#23660)
  fix some typos (vllm-project#24071)
  [V1] Wrapper which plumbs request-level logits processors into vLLM batch-level logits processing (vllm-project#23656)
  Upgrade xgrammar to 0.1.23 (vllm-project#22988)
  Update release pipeline post PyTorch 2.8.0 update (vllm-project#24073)
  [XPU] Fix the bug of LoRA logits on the XPU platform (vllm-project#24081)
  [CI/Build] Disable SiluMul NVFP4 quant fusion tests (vllm-project#24121)
  [Bug] R1 Accuracy: Fix `routed_scaling_factor` Double Mul Issue (vllm-project#24119)
  [AMD][Kernel][Bugfix] Cast offsets tensor bn to tl.int64 to avoid GPU segfault (vllm-project#23692)
  [CI] Enable all hf transformers baselines in test_hybrid (vllm-project#23936)
  [Log] Only Print Profiler Results on Rank 0 (vllm-project#23370)
  Fix weights loading for Apertus (vllm-project#24100)
  [Metrics] Deprecate TPOT in favor of ITL (vllm-project#24110)
  [Bugfix] Fix packed_factor missing attribute error (vllm-project#23902)
  Run ruff format on a few files. (vllm-project#24075)
  [Bugfix] Fix transform_config parsing in Compressed Tensors (vllm-project#23945)
  ...
akaihaoshuai pushed a commit to akaihaoshuai/vllm that referenced this pull request Sep 3, 2025
Signed-off-by: Nathan Ranchin <[email protected]>
Signed-off-by: 子悬 <[email protected]>
MatthewBonanni pushed a commit to MatthewBonanni/vllm that referenced this pull request Sep 3, 2025
Signed-off-by: Nathan Ranchin <[email protected]>
Signed-off-by: Matthew Bonanni <[email protected]>
MatthewBonanni pushed a commit to MatthewBonanni/vllm that referenced this pull request Sep 3, 2025
842974287 pushed a commit to 842974287/vllm that referenced this pull request Sep 3, 2025
Signed-off-by: Nathan Ranchin <[email protected]>
Signed-off-by: Shiyan Deng <[email protected]>
eicherseiji pushed a commit to eicherseiji/vllm that referenced this pull request Sep 9, 2025
LopezCastroRoberto pushed a commit to LopezCastroRoberto/vllm that referenced this pull request Sep 11, 2025
Signed-off-by: Nathan Ranchin <[email protected]>
Signed-off-by: LopezCastroRoberto <[email protected]>