[GGUF] Serialize Generated OV Model for Faster LLMPipeline Init #2218
Merged: rkazants merged 58 commits into openvinotoolkit:master from sammysun0711:gguf_model_cache on Jun 13, 2025
Commits
421627a
Serialize generated OV model from GGUF model for faster pipe initiali…
7d1c9de
Add try-catch to handle expecption raise by serialize, continue with …
cb883ba
Minior refactor to handle different gguf model in same directory
67b2fd7
Explicit save model based on ov::cache_dir properties, add time measu…
20c24b4
use ov:save model to compress OV model
14bc5f5
Merge branch 'master' into gguf_model_cache
ff05f51
Merge branch 'master' into gguf_model_cache
63fd0ee
Implict cache generated ov model constructed from gguf
89671e8
apply review comments
f96a014
Merge branch 'gguf_model_cache' of https://github.com/sammysun0711/op…
83db989
Remove unused header file
e7d5552
Add special property ENALBE_SAVE_OV_MODEL to control whether save gen…
20effa2
Merge branch 'master' into gguf_model_cache
538cd86
Merge branch 'master' into gguf_model_cache
5f23a9c
Simplify logic based on #2129 and #2240
11eaf5d
Add documents and update error message
dc6dd2b
Add test case
e7b1b23
Add ov::genai::enable_save_ov_model property
e7cc549
Control GGUF reader related debug info with OPENVINO_LOG_LEVEL
fc2d433
update test case
495c1d4
Merge branch 'master' into gguf_model_cache
9c808ec
Merge branch 'master' into gguf_model_cache
6ac37fb
minior fix for test
80c26b8
Merge branch 'master' into gguf_model_cache
f2c5080
Update src/cpp/src/gguf_utils/gguf_modeling.cpp
c568ddd
Merge branch 'master' into gguf_model_cache
55e9468
Merge branch 'master' into gguf_model_cache
fa6df48
Merge branch 'master' into gguf_model_cache
32b7427
Fix merge conflict
5e73520
Merge branch 'master' into gguf_model_cache
8488486
Fix merge conflict
8fd892d
move save_openvino_model to utils for re-use
1fd26c2
Save generated ov_tokenizer & ov_detokenzier model for re-use
51f87fa
Update test
f0991a3
Fix review comments
2b0b29a
minnor test fix
807d0f9
Update test
5bf0e69
remove unused import
f8c84f5
Move extract_draft_model_from_config, extract_prompt_lookup_from_conf…
030dcde
Test only: pass no properties to tokenizer/detokenizer
617c1dc
Revert "Move extract_draft_model_from_config, extract_prompt_lookup_f…
01c9eb5
Simplify unused properties handling for tokenizer
76466d1
Set enable_save_ov_model as None by default
afa9bbb
Merge branch 'master' into gguf_model_cache
4788fd3
Merge branch 'master' into gguf_model_cache
13ab0df
test enable_save_ov_model=False only
fb4fb17
enable save_ov_model test
3ef08bf
[Debug only] try use macos-13-large to check if core dump cause by li…
a05b435
Merge branch 'master' into gguf_model_cache
cad5068
Revert "[Debug only] try use macos-13-large to check if core dump cau…
c4a8ce1
release unused pipeline with gc to save memory
62448b6
try to further reduce test memory usage
d3147b5
reduce memory usage for test_pipelines_with_gguf_generate
3c232ac
Split separate test for gguf enable_save_ov_model to save memory usage
1283de6
Fix merge conflict
b6a8384
Refactor test
e354e4e
Fix merge conflict
e618f5c
Merge branch 'master' into gguf_model_cache