Optimized ONNX Transform via Class Merging and Thread Pooling #546
Conversation
Force-pushed from 2df6780 to a528b29
LGTM. Can merge if CI is passing.
Force-pushed from 00fcaf2 to 9b9c41d
Force-pushed from e83ac9e to f64b429
if not apply_clip and not apply_split:
    warnings.warn("Both apply_clip and apply_split are False. Skipping transformation.")
    return model, False

external_data_helper.load_external_data_for_model(model, onnx_base_dir)
Even though combining the transforms saves time in this case, it also reduces the flexibility we have over multiple transforms. In the future, if we need to add more transforms, the condition would become more complex, and a new transform would still need to load the tensors again. I have added a few changes as part of #538 for memory cleanup and reducing peak memory usage. Can you check if the same concepts can be used here?
Yeah, it's kind of a tradeoff between time and flexibility. Let me check that and get back to you.
Optimized ONNX Transform via Class Merging and Thread Pooling
This PR follows up on #539 – Optimized ONNX transform class via multithreading.
It merges the FP16 and Split ONNX transform classes into a single implementation to eliminate redundant tensor loading and iteration. Additionally, the transform logic has been refactored to use a thread pool, replacing the previous sequential loop to parallelize tensor operations.
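As a rough illustration of the approach described above, here is a minimal sketch (not the code from this PR) of a merged transform that loads external tensor data once and fans the per-tensor work out to a thread pool. The helper name merged_onnx_transform, the size threshold, and the per-tensor external-data file naming are assumptions made for the example; apply_clip and apply_split mirror the flags visible in the diff excerpt above.

import warnings
from concurrent.futures import ThreadPoolExecutor

import numpy as np
from onnx import ModelProto, external_data_helper, numpy_helper


def merged_onnx_transform(model: ModelProto, onnx_base_dir=None,
                          apply_clip=True, apply_split=True, max_workers=8):
    """Clip out-of-FP16-range FP32 weights and mark large tensors as external, in one pass."""
    if not apply_clip and not apply_split:
        warnings.warn("Both apply_clip and apply_split are False. Skipping transformation.")
        return model, False

    # Load external tensor data once, instead of once per transform class.
    if onnx_base_dir is not None:
        external_data_helper.load_external_data_for_model(model, onnx_base_dir)

    fp16_max = float(np.finfo(np.float16).max)
    transformed = False

    def process(tensor):
        nonlocal transformed
        arr = numpy_helper.to_array(tensor)
        if apply_clip and arr.dtype == np.float32:
            # Clip values that would overflow FP16.
            clipped = np.clip(arr, -fp16_max, fp16_max)
            if not np.array_equal(clipped, arr):
                tensor.CopyFrom(numpy_helper.from_array(clipped, tensor.name))
                transformed = True
        if apply_split and arr.nbytes > 1024:  # illustrative size threshold
            # Point large tensors at a per-tensor external data file (hypothetical
            # naming); assumes the tensor stores its payload in raw_data.
            external_data_helper.set_external_data(tensor, location=f"{tensor.name}.data")
            transformed = True

    # Thread pool replaces the previous sequential loop over initializers;
    # each worker handles a distinct tensor, so there is no shared mutable state
    # beyond the transformed flag.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        list(pool.map(process, model.graph.initializer))

    return model, transformed

The point of the sketch is that both the FP16 clipping and the split bookkeeping happen in a single pass over the already-loaded initializers, so the tensors are neither loaded nor iterated twice.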
Performance Benchmarks: