fix baichuan-7b tp #598
Conversation
Is this the same reason behind the baichuan-13b issue? #530
Yes. I have tested it on both baichuan-13b and 7b, and both produce normal output under TP (a minimal run sketch follows below).
Can I use this PR directly on 13B?
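For reference, here is a minimal sketch of the kind of tensor-parallel smoke test behind that verification. The model id, prompt, and TP degree are assumptions for illustration, not taken from this thread:

```python
# Minimal TP smoke-test sketch, assuming the Baichuan checkpoints are available
# on the Hugging Face Hub; not the exact commands used by the author.
from vllm import LLM, SamplingParams

llm = LLM(
    model="baichuan-inc/Baichuan-7B",  # swap in the 13B checkpoint to test that path
    tensor_parallel_size=2,            # requires 2 visible GPUs
    trust_remote_code=True,            # Baichuan ships custom modeling code
)

outputs = llm.generate(
    ["The capital of France is"],
    SamplingParams(temperature=0.0, max_tokens=32),
)
print(outputs[0].outputs[0].text)  # should be coherent text, not garbage, under TP
```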
Thank you for your contribution! Could you use our official formatting script and remove the other format-only changes?
Is this the only part that actually changes the code logic? Could you remove the other format-only modifications and use the format.sh script we provide to re-format the code? Thanks!
Hi, I have updated the PR and removed the format-only changes.
356793c to aeb2d9e
LGTM! Thank you for your contribution!
Co-authored-by: wq.chu <[email protected]>
…ct#598)

### What this PR does / why we need it?
DeepSeek V3 currently adopts vanilla chunked prefill in the MLA part, which is inefficient to compute but necessary for chunked prefill. Since PR vllm-project/vllm-ascend#543 brings the v0 scheduler into vllm-ascend, we can now use torch_npu._npu_flash_attention inside the MLA backend for a further performance boost. Some redundant computation inside the RoPE is also removed. This PR should bring a performance gain for DeepSeek eager-mode inference.

---------

Signed-off-by: ganyi <[email protected]>
The main modifications are in the "load_weights" function.
Before:

After:
