Skip to content

Conversation

dongluw
Copy link
Contributor

@dongluw dongluw commented Aug 11, 2025

Essential Elements of an Effective PR Description Checklist

  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

Purpose

Issue: #22648

Support Command-A-Vision https://huggingface.co/CohereLabs/command-a-vision-07-2025/blob/main/config.json

Test Plan

python examples/offline_inference/vision_language_multi_image.py --model-type command_a_vision

--------------------------------------------------
The first image shows a **Mallard duck** swimming in calm blue water. The duck has a vibrant green head, a yellow bill, and a body with brown, white, and black feathers. Its reflection is visible in the water, and the surface has gentle ripples.

The second image features a **lion** sitting in a field of tall, golden-brown grass. The lion has a thick, dark mane and a focused expression, looking slightly to the side. The background is softly blurred, emphasizing the lion's majestic presence in its natural habitat.
--------------------------------------------------
python examples/offline_inference/vision_language.py --model-type command_a_vision

--------------------------------------------------
The image captures a stunning view of the Tokyo Tower, a prominent landmark in Tokyo, Japan, framed by the delicate pink blossoms of cherry trees. The tower, with its white and orange structure, stands tall against a clear blue sky, creating a striking contrast. The cherry blossoms, in full bloom, dominate the foreground
--------------------------------------------------
The image captures a stunning view of the Tokyo Tower, a prominent landmark in Tokyo, Japan, framed by cherry blossoms in full bloom. The tower, with its distinctive white and orange color scheme, stands tall against a clear blue sky. The cherry blossom trees, or sakura, are in the foreground, their delicate
--------------------------------------------------
The image captures a stunning view of the Tokyo Tower, a prominent landmark in Tokyo, Japan, framed by the delicate blossoms of cherry trees in full bloom. The cherry blossoms, known as *sakura* in Japanese, are a quintessential symbol of spring and are celebrated for their fleeting beauty. The contrast between the soft
--------------------------------------------------
The image captures a stunning view of the Tokyo Tower, a prominent landmark in Tokyo, Japan, framed by the delicate blossoms of cherry trees. The tower, with its distinctive white and orange color scheme, stands tall against a clear blue sky. The cherry blossoms, in full bloom, are a vibrant pink and dominate the
--------------------------------------------------

online chat

vllm serve CohereLabs/command-a-vision-07-2025 --disable-log-requests  --tensor-parallel-size 4 --max_model_len 32000 --max-num-seqs 32

python examples/online_serving/openai_chat_completion_client_for_multimodal.py --chat-type multi-image


Chat completion output: The images feature two animals: a mallard duck and an African lion. 

1. **Mallard Duck**: The first image shows a mallard duck swimming in water. Mallards are one of the most recognizable and widespread duck species, known for their iridescent green heads (in males), white collars, and

Test Result

(Optional) Documentation Update

Copy link

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

@mergify mergify bot added documentation Improvements or additions to documentation new-model Requests to new models labels Aug 11, 2025
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for the Command-A-Vision model. The changes include a new model implementation file, along with additions to example scripts and model registries. The core logic seems sound and follows existing patterns in the codebase. However, I've identified a critical bug in the new model implementation related to an incorrect function call that will cause a runtime error. Additionally, there's a high-severity issue where a value is hardcoded, which compromises the code's generality and maintainability. Addressing these points will improve the robustness and quality of the new model support.

@dongluw dongluw marked this pull request as draft August 11, 2025 17:10
Signed-off-by: donglu <[email protected]>
@dongluw dongluw marked this pull request as ready for review August 11, 2025 17:53
@dongluw dongluw requested a review from hmellor as a code owner August 11, 2025 17:53
Copy link
Member

@DarkLight1337 DarkLight1337 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM!

cc @hmellor can you measure the perf difference compared to Transformers backend? In the future we may be able to switch fully to Transformers backend if the performance becomes on par with the custom implementation.

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) August 11, 2025 17:55
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Aug 11, 2025
@hmellor
Copy link
Member

hmellor commented Aug 11, 2025

can you measure the perf difference compared to Transformers backend?

Sure, testing now

@hmellor
Copy link
Member

hmellor commented Aug 11, 2025

Looks like running Command A Vision in the Transformers backend will require #22673 and a change on the Transformers side

@vllm-bot vllm-bot merged commit 9f909b8 into vllm-project:main Aug 12, 2025
39 of 47 checks passed
paulpak58 pushed a commit to paulpak58/vllm that referenced this pull request Aug 13, 2025
diegocastanibm pushed a commit to diegocastanibm/vllm that referenced this pull request Aug 15, 2025
yiliu30 pushed a commit to yiliu30/vllm-fork that referenced this pull request Aug 19, 2025
epwalsh pushed a commit to epwalsh/vllm that referenced this pull request Aug 28, 2025
xiao-llm pushed a commit to xiao-llm/vllm that referenced this pull request Aug 28, 2025
zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Aug 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation new-model Requests to new models ready ONLY add when PR is ready to merge/full CI is needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants