Commit 1ad1363 (parent: d9d521b)

[doc] Update docs

File tree

2 files changed: +3 -3 lines changed


README.md

Lines changed: 2 additions & 2 deletions

````diff
@@ -261,6 +261,8 @@ huggingface-cli download ChenMnZ/Llama-3-8b-instruct-EfficientQAT-w2g128-GPTQ --
 python tools/run_pipeline.py -o ${model_dir} -m llama-3-8b-2bit
 ```
 
+> Use the `-p` or `-s` argument to select the steps you want to run, and the `-u` argument to use our prebuilt kernels for ARM.
+
 An example output:
 
 ```
@@ -288,8 +290,6 @@ Running STEP.6: Run inference
 Check logs/2024-07-15-17-10-11.log for inference output
 ```
 
-Check [e2e.md](docs/e2e.md) for the purpose of each step.
-
 ## Upcoming Features
 
 We will soon:
````
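The `-p`/`-s`/`-u` note added in this commit can be sketched as shell usage. This is a hedged illustration only: the flag names come from the README note, but the model directory and the step value passed to `-s` are placeholders, and the exact flag syntax is an assumption not confirmed by this commit.

```shell
# Hedged sketch of step selection for tools/run_pipeline.py.
# model_dir and the step index are placeholders, not taken from the commit.
model_dir=/path/to/Llama-3-8b-instruct-EfficientQAT-w2g128-GPTQ

# Full pipeline, as shown in the README:
#   python tools/run_pipeline.py -o ${model_dir} -m llama-3-8b-2bit
# Re-run a subset of steps with -s, using prebuilt ARM kernels via -u
# (assumed syntax):
#   python tools/run_pipeline.py -o ${model_dir} -m llama-3-8b-2bit -s 6 -u
echo "would run pipeline for ${model_dir}"
```

Selecting steps this way avoids repeating expensive conversion steps when only inference (STEP.6 in the log above) needs to be re-run.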

docs/e2e.md

Lines changed: 1 addition & 1 deletion

````diff
@@ -1,4 +1,4 @@
-# End-2-End Inference Through llama.cpp
+# End-2-End Inference Through llama.cpp (legacy)
 
 > The following guide uses BitNet-3B. We will add instructions on how to use GPTQ/GGUF/BitDistiller models or even your customized models.
 
````
