[GGUF] Revert GGUF WA for GPU #2392
Conversation
…, which was fixed by PR30698 & PR30941
The same test that was skipped for Mac also fails on Linux. Resolving it for Linux may fix it for Mac as well.

Although it is the same test, I will take a look at a proper fix for the test case.
Pull Request Overview
This PR reverts previous workarounds (WA) for GPU plugin issues in the GGUF model handling code. The changes remove GPU-specific fixes that were implemented to address accuracy and compilation issues on MTL/LNL GPU platforms.
- Removes shared embedding parameter and logic from language model creation
- Reverts dynamic quantization group size from 0 (disabled) back to 64 (enabled)
- Removes zero point array modification workaround for Q4_0 weights
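For context on the second bullet, the dynamic quantization group size controls how many consecutive weights share one dynamically computed scale, where 0 means the feature is disabled. The sketch below is illustrative only (it is not the GPU plugin's actual kernel, and the int8 scaling choice is an assumption): it shows how a group size of 64 yields one scale per 64-element chunk, while 0 collapses to a single scale per row.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative only: per-group absolute-max scales for dynamic quantization.
// group_size == 0 is treated as "disabled" (one scale for the whole row),
// matching the 0 -> 64 revert described above.
std::vector<float> group_scales(const std::vector<float>& row,
                                std::size_t group_size) {
    if (group_size == 0) group_size = row.size();  // disabled: single group
    std::vector<float> scales;
    for (std::size_t start = 0; start < row.size(); start += group_size) {
        const std::size_t end = std::min(start + group_size, row.size());
        float absmax = 0.0f;
        for (std::size_t i = start; i < end; ++i)
            absmax = std::max(absmax, std::fabs(row[i]));
        scales.push_back(absmax / 127.0f);  // assuming an int8 target range
    }
    return scales;
}
```

Smaller groups track local weight magnitude more closely (better accuracy) at the cost of storing more scales, which is the usual trade-off behind picking a value like 64.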
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/cpp/src/gguf_utils/gguf_modeling.cpp | Removes the shared_embedding parameter, reverts dynamic quantization settings, and cleans up GPU-related comments |
| src/cpp/src/gguf_utils/building_blocks.hpp | Updates the make_lm_head function signature to remove the shared_embedding parameter |
| src/cpp/src/gguf_utils/building_blocks.cpp | Removes the shared embedding logic and the zero-point array modification workaround |
```cpp
w_f32 = make_weights_subgraph(key, consts, lm_qtype, false, -1);
} else {
    w_f32 = embeddings_node;
if (consts.count(key + ".weight")) {
```
[nitpick] The logic structure has changed after removing the shared_embedding condition, but the original fallback logic (using embeddings_node when key + ".weight" doesn't exist) is preserved. Consider adding a comment to clarify this fallback behavior for future maintainers.
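The fallback the reviewer describes can be sketched as follows. This is a hypothetical stand-in, not the repository's actual code: the `Node` type, the map contents, and the simplified `make_weights_subgraph` signature are all placeholders; only the branching mirrors the snippet above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for an OpenVINO graph node.
struct Node { std::string origin; };

// Hypothetical, simplified stand-in for make_weights_subgraph.
Node make_weights_subgraph(const std::string& key) {
    return Node{key + ".weight"};
}

// Mirrors the fallback: build a dedicated lm_head weight subgraph when
// "<key>.weight" exists in consts; otherwise reuse the embedding weights
// (tied embeddings, as found in some GGUF models).
Node make_lm_head_weights(const std::string& key,
                          const std::map<std::string, int>& consts,
                          const Node& embeddings_node) {
    if (consts.count(key + ".weight")) {
        return make_weights_subgraph(key);
    }
    return embeddings_node;  // fallback: share the embedding matrix
}
```

Documenting this branch with a comment, as the reviewer suggests, would make it clear that reusing `embeddings_node` is intentional weight tying rather than a leftover from the removed workaround.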
@Wovchena, I updated the GGUF CI tests and all tests passed; the local GPU test also passed. Can we merge this PR?
Details:
Revert GGUF Reader WA for OV GPU plugin: #2110
Ticket:
CVS-169891