Fix weights loading for Apertus #24100
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger a full CI run by default. Instead, it would only run fastcheck CI, which starts running only a small and essential subset of CI tests to quickly catch errors. You can ask your reviewers to trigger select CI tests on top of fastcheck CI. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge. If you have any questions, please reach out to us on Slack at https://slack.vllm.ai. 🚀
Code Review
This pull request correctly addresses the need to load buffers for the XIELU activation function in the Apertus model. The implementation adds the necessary buffers to the `params_dict` used for weight loading. However, the method used to identify these buffers by their name suffix is slightly brittle and could lead to issues in the future. I've suggested a more robust approach that directly targets `XIELU` modules, which will be more resilient to future changes in the model architecture.
```python
for name, buffer in self.named_buffers():
    if name.endswith(".beta") or name.endswith(".eps"):
        params_dict[name] = buffer
```
The current approach of identifying buffers to load by checking whether their names end with `.beta` or `.eps` is a bit brittle. It could unintentionally match buffers from other parts of the model that happen to use the same suffix (for example, a hypothetical normalization layer registering its own `eps` buffer would be picked up too). This could lead to incorrect weights being loaded, or to errors if shapes don't match.

A more robust approach would be to explicitly iterate over `XIELU` modules and add their buffers to `params_dict`. This ensures that you are only targeting the intended buffers and makes the code more resilient to future changes.
```diff
-for name, buffer in self.named_buffers():
-    if name.endswith(".beta") or name.endswith(".eps"):
-        params_dict[name] = buffer
+for module_name, module in self.named_modules():
+    if isinstance(module, XIELU):
+        for buffer_name, buffer in module.named_buffers(recurse=False):
+            full_name = f'{module_name}.{buffer_name}'
+            params_dict[full_name] = buffer
```
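To make the difference concrete, here is a minimal, self-contained sketch. The `XIELU` and `Norm` classes below are hypothetical stand-ins (not vLLM's actual implementations) that reproduce the naming collision described above; the suffix check also picks up the unrelated `norm.eps` buffer, while the `isinstance` check does not:

```python
import torch
import torch.nn as nn

class XIELU(nn.Module):
    """Hypothetical stand-in: registers per-layer values as buffers."""
    def __init__(self):
        super().__init__()
        self.register_buffer("beta", torch.tensor(1.0))
        self.register_buffer("eps", torch.tensor(1e-6))

class Norm(nn.Module):
    """Unrelated module whose buffer happens to share the .eps suffix."""
    def __init__(self):
        super().__init__()
        self.register_buffer("eps", torch.tensor(1e-5))

model = nn.ModuleDict({"act": XIELU(), "norm": Norm()})

# Suffix-based collection: also grabs the unrelated norm.eps buffer.
by_suffix = {name: buf for name, buf in model.named_buffers()
             if name.endswith(".beta") or name.endswith(".eps")}

# Type-based collection: only buffers owned by XIELU modules.
by_type = {}
for module_name, module in model.named_modules():
    if isinstance(module, XIELU):
        for buffer_name, buffer in module.named_buffers(recurse=False):
            by_type[f"{module_name}.{buffer_name}"] = buffer

print(sorted(by_suffix))  # ['act.beta', 'act.eps', 'norm.eps']
print(sorted(by_type))    # ['act.beta', 'act.eps']
```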
Signed-off-by: Nathan Ranchin <[email protected]>
…e buffer Signed-off-by: Nathan Ranchin <[email protected]>
Force-pushed from 797e773 to 7bab46c
Hi,
Retrying the failing test
Signed-off-by: Nathan Ranchin <[email protected]>
* 'main' of https://github.com/845473182/vllm: (457 commits)
  [BugFix] Fix routed_scaling_factor double mul for dots1 and glm4 MoE models (vllm-project#24132)
  [Misc] Add check for dual_chunk_attention (vllm-project#24070)
  [Doc]: fix typos in Python comments (vllm-project#24115)
  [Doc]: fix typos in Python comments (vllm-project#24093)
  [Compile] Fix Compile Warning for `w4a8_mm_entry.cu` (vllm-project#23660)
  fix some typos (vllm-project#24071)
  [V1] Wrapper which plumbs request-level logits processors into vLLM batch-level logits processing (vllm-project#23656)
  Upgrade xgrammar to 0.1.23 (vllm-project#22988)
  Update release pipeline post PyTorch 2.8.0 update (vllm-project#24073)
  [XPU] Fix the bug of LoRA logits on the XPU platform (vllm-project#24081)
  [CI/Build] Disable SiluMul NVFP4 quant fusion tests (vllm-project#24121)
  [Bug] R1 Accuracy: Fix `routed_scaling_factor` Double Mul Issue (vllm-project#24119)
  [AMD][Kernel][Bugfix] Cast offsets tensor bn to tl.int64 to avoid GPU segfault (vllm-project#23692)
  [CI] Enable all hf transformers baselines in test_hybrid (vllm-project#23936)
  [Log] Only Print Profiler Results on Rank 0 (vllm-project#23370)
  Fix weights loading for Apertus (vllm-project#24100)
  [Metrics] Deprecate TPOT in favor of ITL (vllm-project#24110)
  [Bugfix] Fix packed_factor missing attribute error (vllm-project#23902)
  Run ruff format on a few files. (vllm-project#24075)
  [Bugfix] Fix transform_config parsing in Compressed Tensors (vllm-project#23945)
  ...
Signed-off-by: Nathan Ranchin <[email protected]> Signed-off-by: 子悬 <[email protected]>
Signed-off-by: Nathan Ranchin <[email protected]> Signed-off-by: Matthew Bonanni <[email protected]>
Signed-off-by: Nathan Ranchin <[email protected]> Signed-off-by: Shiyan Deng <[email protected]>
Signed-off-by: Nathan Ranchin <[email protected]> Signed-off-by: LopezCastroRoberto <[email protected]>
Purpose
This PR fixes the weight-loading mechanism for the Apertus models. The model uses a custom XIELU activation function that requires custom parameters for each layer. These parameters are stored alongside the model's weights, so the corresponding buffers need to be loaded too.
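As a rough sketch of where the fix lives (assuming a `load_weights()` structured like other vLLM models; the surrounding lines are illustrative, not the exact Apertus source), the buffers are added to the `params_dict` that maps checkpoint names to their destinations:

```python
# Minimal sketch, assuming a load_weights() shaped like other vLLM models;
# the surrounding code is illustrative rather than the exact Apertus source.
params_dict = dict(self.named_parameters())

# XIELU keeps some of its per-layer values (beta, eps) as buffers rather
# than parameters, so named_parameters() misses them and their checkpoint
# entries would have nowhere to land without this addition.
for name, buffer in self.named_buffers():
    if name.endswith(".beta") or name.endswith(".eps"):
        params_dict[name] = buffer
```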
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.