Commit eeb3446

[doc] Update README.md
1 parent 1d19d0a commit eeb3446

1 file changed, +14 -6 lines changed

README.md

Lines changed: 14 additions & 6 deletions
@@ -98,11 +98,12 @@ T-MAC achieves comparable 2-bit mpGEMM performance compared to CUDA GPU on Jetso
 
 ### Requirements
 
-- Python (3.8 recommended)
+- Python (3.8 required for TVM)
 - virtualenv
 - cmake>=3.22
 
-### OSX (Apple Silicon)
+<details>
+<summary><h3>OSX (Apple Silicon)</h3></summary>
 
 First, install `cmake`, `zstd` (dependency of llvm) and `libomp` (dependency of tvm). Homebrew is recommended:
 
@@ -123,7 +124,9 @@ source build/t-mac-envs.sh
 
 The command will download clang+llvm and build tvm from source. So it might take a bit of time.
 
-### Ubuntu (aarch64/x86_64)
+</details>
+<details>
+<summary><h3>Ubuntu (aarch64/x86_64)</h3></summary>
 
 Install cmake>=3.22 from [Official Page](https://cmake.org/download/).
 
@@ -144,7 +147,9 @@ source build/t-mac-envs.sh
 
 The command will download clang+llvm and build tvm from source. So it might take a bit of time.
 
-### Windows (x86_64)
+</details>
+<details>
+<summary><h3>Windows (x86_64)</h3></summary>
 
 Due to lack of stable clang+llvm prebuilt on Windows, Conda + Visual Studio is recommended to install dependencies.
 
@@ -184,7 +189,9 @@ $env:PYTHONPATH = "$pwd\3rdparty\tvm\python"
 pip install . -v # or pip install -e . -v
 ```
 
-### Windows (ARM64)
+</details>
+<details>
+<summary><h3>Windows (ARM64)</h3></summary>
 
 > The following process could be more complicated. However, if your deployment scenerio doesn't require a native build, you can use WSL/docker and follow the Ubuntu guide.
 
@@ -240,6 +247,8 @@ pip install wmi # To detect the native ARM64 CPU within x86_64 python
 pip install . -v # or pip install -e . -v
 ```
 
+</details>
+
 ### Verification
 
 After that, you can verify the installation through: `python -c "import t_mac; print(t_mac.__version__); from tvm.contrib.clang import find_clang; print(find_clang())"`.
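
The verification one-liner quoted in the hunk above can also be written as a small script that reports each check separately instead of stopping at the first failure. This is a minimal sketch, not part of the commit: the function name `check_install` and the report format are hypothetical, and it relies only on the `t_mac.__version__` attribute and the `tvm.contrib.clang.find_clang` function that the README command itself uses.

```python
import importlib


def check_install():
    """Run the two README checks and return a {name: status} report (hypothetical helper)."""
    report = {}

    # Check 1: the t_mac package imports and exposes a version string.
    try:
        t_mac = importlib.import_module("t_mac")
        report["t_mac"] = str(getattr(t_mac, "__version__", "unknown"))
    except Exception as exc:  # import can fail for reasons other than ImportError
        report["t_mac"] = f"unavailable: {exc}"

    # Check 2: TVM imports and can locate a clang binary.
    try:
        clang = importlib.import_module("tvm.contrib.clang")
        report["clang"] = str(clang.find_clang())
    except Exception as exc:
        report["clang"] = f"unavailable: {exc}"

    return report


if __name__ == "__main__":
    for name, status in check_install().items():
        print(f"{name}: {status}")
```

An "unavailable" entry for either check suggests that the environment script (`source build/t-mac-envs.sh` on OSX and Ubuntu) has not been sourced in the current shell or that the build did not complete.
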
@@ -317,7 +326,6 @@ Our method exhibits several notable characteristics:
 
 1. T-MAC shows a linear scaling ratio of FLOPs and inference latency relative to the number of bits. This contrasts with traditional convert-based methods, which fail to achieve additional speedup when reducing from 4 bits to lower bits.
 2. T-MAC inherently supports bit-wise computation for int1/2/3/4, eliminating the need for dequantization. Furthermore, it accommodates all types of activations (e.g., fp8, fp16, int8) using fast table lookup and add instructions, bypassing the need for poorly supported fused-multiply-add instructions.
-3. T-MAC holds the potential to realize performance gains across all processing units (PUs).
 
 ## Cite
 If you find this repository useful, please use the following BibTeX entry for citation.
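
The characteristics list in the last hunk describes bit-wise computation by table lookup and add. The idea can be illustrated with a toy dot product in plain Python. This is a sketch under stated assumptions, not T-MAC's actual kernel: the weights are unsigned integers, the group size `G = 4` is chosen for illustration, and the names `build_tables`, `lut_dot` and `reference_dot` are hypothetical.

```python
G = 4  # activations covered by one lookup table (assumed group size)


def build_tables(activations):
    """For each group of G activations, precompute its dot product with every G-bit pattern."""
    tables = []
    for start in range(0, len(activations), G):
        group = activations[start:start + G]
        table = []
        for pattern in range(1 << G):
            total = 0.0
            for j, a in enumerate(group):
                if (pattern >> j) & 1:
                    total += a
            table.append(total)
        tables.append(table)
    return tables


def lut_dot(weights, activations, bits):
    """Dot product of unsigned `bits`-bit integer weights with activations, using lookups and adds."""
    assert len(weights) == len(activations) and len(weights) % G == 0
    tables = build_tables(activations)
    result = 0.0
    for b in range(bits):  # one pass per weight bit, so work grows linearly with `bits`
        plane_sum = 0.0
        for g_idx, table in enumerate(tables):
            # Pack bit b of the G weights in this group into a table index.
            idx = 0
            for j in range(G):
                idx |= ((weights[g_idx * G + j] >> b) & 1) << j
            plane_sum += table[idx]
        result += plane_sum * (1 << b)  # weight this bit plane by 2^b
    return result


def reference_dot(weights, activations):
    """Plain multiply-accumulate, for comparison."""
    return sum(w * a for w, a in zip(weights, activations))
```

In this sketch the tables are built once per activation vector and reused for every bit plane, and the loop over `b` is the only part whose cost depends on the bit width. That is one way to read the linear-scaling claim in item 1 of the list.
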
