[eplatero] Add support for exporting and compiling models for SpD (#119)
* rebase onto main. The previous local gen_spd_models was broken because it was not picking up the latest changes from main. To fix this, I found the common ancestor, picked up the latest changes from main, and made a new commit containing all unique changes for PR 119
Signed-off-by: eplatero <[email protected]>
* add decode_seq_len to non-continuous batching case
Signed-off-by: eplatero <[email protected]>
* mirror test_causal_lm_models.py from main
Signed-off-by: eplatero <[email protected]>
* add more to the explanation of the model changes
Signed-off-by: eplatero <[email protected]>
* lint fixing
Signed-off-by: eplatero <[email protected]>
* alphabetical order imports on pytorch_transforms.py
Signed-off-by: eplatero <[email protected]>
* add init to spd directory
Signed-off-by: eplatero <[email protected]>
* replace modifying seq_len by letting user define proper config
Signed-off-by: eplatero <[email protected]>
* resolve first-round comments from Onkar and fix the gather implementation
Signed-off-by: eplatero <[email protected]>
* removing old unit tests
Signed-off-by: eplatero <[email protected]>
* added a way to make num_logits_to_keep dynamic in ONNX, removing the need to regenerate ONNX for different values of num_logits_to_keep (only the qpc is recompiled); ran formatter; reorganized pytorch transforms
Signed-off-by: Onkar Chougule <[email protected]>
* changed interface to be similar to CB
Signed-off-by: Onkar Chougule <[email protected]>
* made unit tests work with array approach
Signed-off-by: eplatero <[email protected]>
* for TLM, made specialization return 1 logit for prefill and for decode
Signed-off-by: eplatero <[email protected]>
* moved from to method because this flag only has implications for compile stage, not export
Signed-off-by: eplatero <[email protected]>
* fixing qpc directory naming to be backwards compatible
Signed-off-by: eplatero <[email protected]>
* updating docstrings and documentation
Signed-off-by: eplatero <[email protected]>
* revert changes to CLI exportation of onnx and specialization to reflect state in main branch
Signed-off-by: eplatero <[email protected]>
* fixed specializations creation and ran formatter
Signed-off-by: Onkar Chougule <[email protected]>
* add pytorch-level unit test
Signed-off-by: eplatero <[email protected]>
* uncommented non-llama pytorch-level unit test
Signed-off-by: eplatero <[email protected]>
* modified pytorch level unit test and added hf vs ort vs qaic unit test
Signed-off-by: eplatero <[email protected]>
* change llama test model from JackFram to TinyLlama to match other tests
Signed-off-by: eplatero <[email protected]>
* fix failing tlm_dlm tests by passing is_tlm correctly in modeling_auto
Signed-off-by: eplatero <[email protected]>
* rm dlm specialization
Signed-off-by: eplatero <[email protected]>
* updated quick_docs
Signed-off-by: eplatero <[email protected]>
* rm tlm dims test since that's already tested and generalize common code in pytorch_transforms
Signed-off-by: eplatero <[email protected]>
* rm flag from non-test definition
Signed-off-by: eplatero <[email protected]>
* rm unnecessary function that is not used
Signed-off-by: eplatero <[email protected]>
* ran formatter and linter
Signed-off-by: Onkar Chougule <[email protected]>
---------
Signed-off-by: eplatero <[email protected]>
Signed-off-by: Onkar Chougule <[email protected]>
Co-authored-by: eplatero <[email protected]>
Co-authored-by: Onkar Chougule <[email protected]>