@quic-sanising quic-sanising commented Sep 4, 2025

📢 Expanded On-Device Sampling Support in QEfficient

Excited to share that On-Device Sampling, previously available only for LlamaForCausalLM, is now supported across a broader set of architectures! This brings the faster, more efficient inference of sampling directly on the QAIC device to the models listed below.

✅ Supported Architectures (newly added unless marked existing):

  1. FalconForCausalLM
  2. GemmaForCausalLM
  3. GPT2LMHeadModel
  4. GPTJForCausalLM
  5. GraniteForCausalLM
  6. GraniteMoeForCausalLM
  7. LlamaForCausalLM (existing)
  8. MptForCausalLM
  9. Phi3ForCausalLM
  10. Qwen2ForCausalLM

⚠️ Architectures Still Pending Support:

  1. GPTBigCodeForCausalLM
  2. InternVLChatModel
  3. MistralForCausalLM
  4. MixtralForCausalLM
  5. LlamaSwiftKVForCausalLM
  6. Grok1ModelForCausalLM

We’re actively working to extend support to these models. Contributions, feedback, and testing from the community are always welcome to help accelerate this effort!
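For anyone scripting against the lists above, here is a minimal sketch of a lookup that reports whether a given architecture is covered. The helper name `on_device_sampling_status` and the two set constants are hypothetical and are not part of QEfficient; they only mirror the lists in this description and would need updating as support expands.

```python
# Hypothetical helper, NOT part of the QEfficient API.
# The two sets simply mirror the lists in this PR description.
SUPPORTED_ARCHITECTURES = frozenset({
    "FalconForCausalLM",
    "GemmaForCausalLM",
    "GPT2LMHeadModel",
    "GPTJForCausalLM",
    "GraniteForCausalLM",
    "GraniteMoeForCausalLM",
    "LlamaForCausalLM",  # supported before this PR
    "MptForCausalLM",
    "Phi3ForCausalLM",
    "Qwen2ForCausalLM",
})

PENDING_ARCHITECTURES = frozenset({
    "GPTBigCodeForCausalLM",
    "InternVLChatModel",
    "MistralForCausalLM",
    "MixtralForCausalLM",
    "LlamaSwiftKVForCausalLM",
    "Grok1ModelForCausalLM",
})


def on_device_sampling_status(architecture: str) -> str:
    """Return 'supported', 'pending', or 'unknown' for an architecture name."""
    if architecture in SUPPORTED_ARCHITECTURES:
        return "supported"
    if architecture in PENDING_ARCHITECTURES:
        return "pending"
    # Not mentioned in this PR description at all.
    return "unknown"


if __name__ == "__main__":
    for name in ("Qwen2ForCausalLM", "MistralForCausalLM", "BertModel"):
        print(f"{name}: {on_device_sampling_status(name)}")
```

The architecture string is the class name that Hugging Face checkpoints typically record in the `architectures` field of `config.json`, so it can be read from there before deciding whether to request on-device sampling.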

quic-sanising and others added 30 commits June 18, 2025 13:38
Signed-off-by: quic-sanising <[email protected]>
Signed-off-by: sanising <[email protected]>
@quic-sanising changed the title from "Extend On Device Sampling Support to more Causal Language Models" to "Extend On-Device Sampling Support to more Causal Language Models" on Sep 4, 2025
@quic-sanising (Contributor, Author) commented

Depends on PR #463.

@quic-sanising quic-sanising changed the base branch from main to ods-unit-tests September 4, 2025 20:26
@quic-sanising quic-sanising changed the base branch from ods-unit-tests to main September 4, 2025 20:26