tianmu-li
Contributor

@tianmu-li tianmu-li force-pushed the test_async branch 3 times, most recently from 778161a to 68609db September 11, 2025 00:35
@tianmu-li tianmu-li changed the title from "[WIP] Fully overlap model execution" to "Fully overlap model execution" Sep 11, 2025
@tianmu-li tianmu-li marked this pull request as ready for review September 11, 2025 03:20
@tianmu-li
Contributor Author

Half overlapping. There is still one sync point at https://github.com/vllm-project/vllm-gaudi/pull/134/files#diff-5ffdc7547fbc10ff45e9791caaef30c306a59a0e3f7c9515569f342baed8c0e2R116, but I can't find a safe way to remove it.

Signed-off-by: Tianmu Li <[email protected]>

Incorporate commit by Marcin

Signed-off-by: Tianmu Li <[email protected]>

Pre-commit fix

Signed-off-by: Tianmu Li <[email protected]>

Remove unneeded change

Signed-off-by: Tianmu Li <[email protected]>

WIP

Signed-off-by: Tianmu Li <[email protected]>

pre-commit fix

Signed-off-by: Tianmu Li <[email protected]>
@xuechendi
Collaborator

/run-gaudi-tests

Collaborator

@xuechendi xuechendi left a comment

Had an offline discussion with Tianmu; the code looks good.
Perf and profiling results will be added to the ticket.

@xuechendi
Collaborator

Failed on spec decode when async_scheduler is not on; cancelling the run.

@tianmu-li
Contributor Author

Fixed the issue by preventing concatenation of decode_sampled_token_ids and prefill_sampled_token_ids when not using async_scheduling.
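
A minimal sketch of the guard described above, assuming the sampled token IDs are plain Python lists. The function name `merge_sampled_token_ids`, the decode-then-prefill ordering, and the `None` return value are hypothetical illustrations and are not taken from the PR diff.

```python
from typing import List, Optional


def merge_sampled_token_ids(
    decode_sampled_token_ids: List[List[int]],
    prefill_sampled_token_ids: List[List[int]],
    async_scheduling: bool,
) -> Optional[List[List[int]]]:
    """Hypothetical helper: merge the two lists only under async scheduling."""
    if not async_scheduling:
        # Synchronous path: skip the concatenation so the caller keeps
        # handling decode and prefill results separately.
        return None
    # Async path: one combined list (ordering here is an assumption).
    return decode_sampled_token_ids + prefill_sampled_token_ids


decode_ids = [[11], [12]]
prefill_ids = [[21]]
merged_async = merge_sampled_token_ids(decode_ids, prefill_ids, async_scheduling=True)
merged_sync = merge_sampled_token_ids(decode_ids, prefill_ids, async_scheduling=False)
```

With the flag off, the helper returns `None` and leaves both input lists untouched, which is the behaviour the fix relies on for the spec decode path.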

@mgawarkiewicz-intel
Collaborator

/run-gaudi-tests

@xuechendi
Collaborator

/run-gaudi-tests

@xuechendi
Collaborator

/run-gaudi-tests

@xuechendi xuechendi merged commit dca6719 into vllm-project:main Sep 15, 2025
8 checks passed
kdamaszk pushed a commit to kdamaszk/vllm-gaudi that referenced this pull request Sep 18, 2025
Dependent on vllm-project/vllm#23569

---------

Signed-off-by: Tianmu Li <[email protected]>
Co-authored-by: Chendi.Xue <[email protected]>
mswiniarsk pushed a commit that referenced this pull request Sep 18, 2025
Porting changes from main branch: #134 and #184

Author: Tianmu Li <[email protected]>

---------

Signed-off-by: Tianmu Li <[email protected]>
Co-authored-by: Tianmu Li <[email protected]>
Co-authored-by: Chendi.Xue <[email protected]>
slokesha pushed a commit to slokesha/vllm-gaudi that referenced this pull request Sep 24, 2025
Dependent on vllm-project/vllm#23569

---------

Signed-off-by: Tianmu Li <[email protected]>
Co-authored-by: Chendi.Xue <[email protected]>