
Commit cc24b3b

[Doc] Add T-MAN news and update release link
1 parent 039feb6 commit cc24b3b

File tree

2 files changed: +3 / -5 lines

README.md

Lines changed: 2 additions & 4 deletions

@@ -12,6 +12,8 @@
 
 ## News
 
+- 10/10/2024 🚀🚀: The idea of T-MAC extends its capabilities to NPU! For more information, check out the [t-man README](t-man/README.md) and try BitNet/Qwen3/Llama3 with the demo app!
+
 - 10/21/2024 🎉🎉: [BitNet](https://github.com/microsoft/BitNet), powered by T-MAC, is open-sourced.
 
 - 10/10/2024 🚀🚀: By updating and rebasing our llama.cpp version, T-MAC now supports more models (e.g., qwen2), and end-to-end performance is further improved by 10~15%! Try qwen2 using [the official GPTQ model](https://huggingface.co/Qwen/Qwen2-7B-Instruct-GPTQ-Int4).
@@ -24,10 +26,6 @@
 
 - 07/27/2024 ✨: We've noted that T-MAC is even faster than the NPU in token generation speed on the latest Snapdragon X Elite chipset! Check [Compared to NPU](#compared-to-npu) for more details.
 
-- 07/23/2024 🚀🚀: We've enabled the execution of any 2-bit quantized Llama model in GPTQ format via T-MAC! Test it using the pretrained models released by [EfficientQAT](https://github.com/OpenGVLab/EfficientQAT).
-
-- 07/22/2024 🚀🚀: We've added native deployment support for Windows on ARM. T-MAC demonstrates a substantial 5x speedup on the Surface Laptop 7.
-
 ## Introduction
 
 T-MAC is a kernel library to directly support mixed-precision matrix multiplication (int1/2/3/4 x int8/fp16/fp32) without the need for dequantization by utilizing lookup tables. T-MAC aims to boost low-bit LLM inference on CPUs. T-MAC already offers support for various low-bit models, including W4A16 from GPTQ/gguf, W2A16 from [BitDistiller](https://github.com/DD-DuDa/BitDistiller)/[EfficientQAT](https://github.com/OpenGVLab/EfficientQAT), and W1(.58)A8 from [BitNet](https://huggingface.co/1bitLLM/bitnet_b1_58-3B) on OSX/Linux/Windows equipped with ARM/Intel CPUs.
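The introduction above describes T-MAC's core idea: replacing dequantize-then-multiply with table lookups. As a rough illustration only (not T-MAC's actual kernels, which operate on packed low-bit weights with SIMD in-register lookups), here is a minimal NumPy sketch of lookup-table-based matrix-vector multiplication for 1-bit (±1) weights; the function name and group size `g` are hypothetical:

```python
import numpy as np

def lut_matvec_1bit(W_bits, x, g=4):
    """Multiply a 1-bit weight matrix (entries ±1, stored as 0/1 bits) by a
    float vector x using per-group lookup tables instead of dequantizing.
    W_bits: (rows, n) array of 0/1; x: (n,) floats; n must be divisible by g."""
    rows, n = W_bits.shape
    assert n % g == 0
    num_groups = n // g
    # Precompute, for every group of g activations, its dot product with
    # every possible g-bit weight pattern: 2**g table entries per group.
    patterns = np.array([[1.0 if (p >> i) & 1 else -1.0 for i in range(g)]
                         for p in range(2 ** g)])           # (2**g, g)
    tables = patterns @ x.reshape(num_groups, g).T          # (2**g, num_groups)
    # For each output row, turn each g-bit weight group into a table index
    # and sum the looked-up partial products -- no dequantization needed.
    weights = 1 << np.arange(g)
    out = np.zeros(rows)
    for r in range(rows):
        idx = (W_bits[r].reshape(num_groups, g) * weights).sum(axis=1)
        out[r] = tables[idx, np.arange(num_groups)].sum()
    return out
```

The table-build cost is amortized across all weight rows, which is why the approach pays off for the tall weight matrices in LLM inference; the real kernels additionally pack indices into bytes and use vectorized byte-shuffle lookups.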

t-man/README.md

Lines changed: 1 addition & 1 deletion

@@ -17,7 +17,7 @@ By achieving up to 50 t/s token generation for [BitNet-2B-4T](https://huggingfac
 
 ### Use the Android App
 
-- Get the apk from the [release page]().
+- Get the apk from the [release page](https://github.com/microsoft/T-MAC/releases).
 - Select a model (e.g., Qwen3-8B) in the settings. The model files will be downloaded automatically (requires internet access).
 - Load the model.
 - Enjoy your conversation!

0 commit comments