Contributor

@ignaciosica ignaciosica commented Jul 18, 2025

Purpose

This PR fixes the CPU installation from source on Apple silicon. Recently, basic serving and the examples stopped working in my local setup (M3 Pro). I bisected the regression to PR #14129, which introduced int8 quantization for ARM CPUs. Although Apple silicon shares part of its code path with other ARM CPUs in cpu_extension.cmake, it also has its own configuration, and that configuration broke the build once int8 support landed. Specifically, the Apple silicon path did not set ASIMD_FOUND, so quant.cpp (cpu_extension.cmake:284) was never added to the sources.

After a fresh install from source:

git clone https://github.com/vllm-project/vllm.git
cd vllm
uv venv --python 3.12 --seed
source .venv/bin/activate
uv pip install -r requirements/cpu.txt
uv pip install -e .

the basic example failed with:

python examples/offline_inference/basic/basic.py
> WARNING 07-18 12:13:35 [_custom_ops.py:20] Failed to import from vllm._C with ImportError("dlopen([...]/vllm/vllm/_C.abi3.so, 0x0002): symbol not found in flat namespace '__Z14int8_scaled_mmRN2at6TensorERKS0_S3_S3_S3_RKNSt3__18optionalIS0_EE'")
> [...]
> AttributeError: '_OpNamespace' '_C_cache_ops' object has no attribute 'reshape_and_cache'
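
The failure can also be checked in isolation, without running the full example, by importing the compiled extension directly. The following is a minimal sketch rather than part of this PR; the helper name `try_import` is hypothetical, and `vllm._C` is the module named in the warning above.

```python
import importlib


def try_import(module_name: str) -> tuple[bool, str]:
    """Try to import a module and return (ok, error_message)."""
    try:
        importlib.import_module(module_name)
    except ImportError as exc:
        # A symbol missing from a compiled extension surfaces as an
        # ImportError raised by the dynamic loader.
        return False, str(exc)
    return True, ""
```

Calling `try_import("vllm._C")` on an affected build should return `False` together with the loader's message, which makes it easier to tell a broken extension apart from a problem in the example script itself.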

To fix this, this PR enables ASIMD_FOUND for Apple silicon as well. For safety, it checks for support with the following command: sysctl -n hw.optional.neon. As far as I know, every Apple silicon generation from M1 through M4 supports this feature, but I still decided to gate ASIMD_FOUND on this check in case support is dropped in a future generation.
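
For readers who want to try the same probe outside the build, here is a rough Python sketch of the logic. It is not the CMake code from this PR; the function names `run_sysctl` and `has_feature` are hypothetical, and treating a value of `1` as "supported" is an assumption about how the key reports.

```python
import subprocess
from typing import Callable


def run_sysctl(name: str) -> str:
    """Return the raw output of `sysctl -n <name>`, or "" on any failure."""
    try:
        result = subprocess.run(
            ["sysctl", "-n", name],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        # The sysctl binary is not available on this platform.
        return ""
    return result.stdout if result.returncode == 0 else ""


def has_feature(name: str, runner: Callable[[str], str] = run_sysctl) -> bool:
    """True only when the key exists and reports the value 1 (assumed)."""
    return runner(name).strip() == "1"
```

A missing key, a missing `sysctl` binary, and a value of `0` all count as "not supported", which mirrors the conservative gating described above. The `runner` parameter exists only so the check can be exercised on machines without the key.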

With this fix, CPU installation from source works again.

This PR also enables bf16 support for Apple silicon. The feature is gated on the hw.optional.arm.FEAT_BF16 sysctl key. Based on this LLVM commit, bf16 support was introduced on the CPU in the M2 generation.
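
The two gates can be pictured together as a small pure function. This is only an illustration in Python, not the PR's CMake logic: the function name `cpu_feature_flags`, the result keys `asimd` and `bf16`, and the sample values are all hypothetical, and a reported value of `1` is assumed to mean "supported".

```python
def cpu_feature_flags(sysctl_values: dict[str, str]) -> dict[str, bool]:
    """Map raw sysctl outputs to feature flags; absent keys count as off."""

    def enabled(key: str) -> bool:
        return sysctl_values.get(key, "").strip() == "1"

    return {
        "asimd": enabled("hw.optional.neon"),
        "bf16": enabled("hw.optional.arm.FEAT_BF16"),
    }


# Hypothetical input for a machine that has NEON but no bf16 support.
sample = {"hw.optional.neon": "1\n", "hw.optional.arm.FEAT_BF16": "0\n"}
flags = cpu_feature_flags(sample)
```

Keeping the two flags independent means a chip without bf16 still gets the ASIMD path, so the int8 sources are built either way.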

Test Plan

Unfortunately, I'm not familiar enough with the CI build runs; I would appreciate some guidance on this point.

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@mergify mergify bot added the ci/build label Jul 18, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request fixes a build issue for Apple Silicon CPUs by enabling ASIMD support and introducing a check_sysctl function. To improve robustness, I've identified a critical issue regarding argument quoting in the new check_sysctl function that should be addressed.

@DarkLight1337 DarkLight1337 requested a review from mgoin July 19, 2025 09:36
@ignaciosica ignaciosica force-pushed the fix_fresh_source_install branch from 537a376 to 34d30da Compare July 20, 2025 01:18
@ignaciosica
Contributor Author

Addressed the commit signing issue.

@@ -70,7 +86,10 @@ endfunction()
is_avx512_disabled(AVX512_DISABLED)

if (MACOSX_FOUND AND CMAKE_SYSTEM_PROCESSOR STREQUAL "arm64")
set(APPLE_SILICON_FOUND TRUE)
Contributor Author

Removed APPLE_SILICON_FOUND as it was no longer used downstream, but maybe it's still useful to keep it for future reference/use?

Member

Yeah, this would be useful to follow the pattern and use in downstream code.

@ignaciosica ignaciosica force-pushed the fix_fresh_source_install branch from 34d30da to e25e799 Compare July 24, 2025 23:29
@mgoin mgoin added the ready label Jul 24, 2025
Member

@mgoin mgoin left a comment

LGTM, I think this is reasonable and shouldn't be disruptive to other backends.

@mgoin mgoin added the cpu label Jul 25, 2025
@vllm-bot vllm-bot merged commit 5140f54 into vllm-project:main Jul 25, 2025
46 of 48 checks passed