Benchmarking v2 GH workflows #40716
base: main
@@ -0,0 +1,82 @@
name: Benchmark v2 Framework

on:
  workflow_call:
    inputs:
      runner:
        description: 'GH Actions runner group to use'
        required: true
        type: string
      commit_sha:
        description: 'Commit SHA to benchmark'
        required: false
        type: string
        default: ''
      upload_to_hub:
        description: 'Enable/disable uploading results to a HuggingFace Dataset'
        required: false
        type: string
        default: 'false'
      run_id:
        description: 'Custom run ID for organizing results (auto-generated if not provided)'
        required: false
        type: string
        default: ''
      benchmark_repo_id:
        description: 'HuggingFace Dataset to upload results to (e.g., "org/benchmark-results")'
        required: false
        type: string
        default: ''

env:
  HF_HOME: /mnt/cache
  TRANSFORMERS_IS_CI: yes
  # For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access.
  # This token is created under the bot `hf-transformers-bot`.
  HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}

jobs:
  benchmark-v2:
    name: Benchmark v2
    runs-on: ${{ inputs.runner }}
    if: |
      (github.event_name == 'pull_request' && contains( github.event.pull_request.labels.*.name, 'run-benchmark')) ||
      (github.event_name == 'schedule')
    container:
      image: huggingface/transformers-pytorch-gpu
      options: --gpus all --privileged --ipc host --shm-size "16gb"
    steps:
      - name: Get repo
        uses: actions/checkout@v4
        with:
          ref: ${{ inputs.commit_sha || github.sha }}

      - name: Install benchmark dependencies
        run: |
          python3 -m pip install -r benchmark_v2/requirements.txt
Review comment on lines +54 to +56: we would like to have a Docker image with all of these, IMO! Only "reinstall" the latest updates.

      - name: Reinstall transformers in edit mode
        run: |
          python3 -m pip uninstall -y transformers
          python3 -m pip install -e ".[torch]"

      - name: Show installed libraries and their versions
        run: |
          python3 -m pip list
          python3 -c "import torch; print(f'PyTorch version: {torch.__version__}')"
          python3 -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"
          python3 -c "import torch; print(f'CUDA device count: {torch.cuda.device_count()}')" || true
          nvidia-smi || true

      - name: Run benchmark v2
        working-directory: benchmark_v2
        run: |
          echo "Running benchmarks"
          python3 run_benchmarks.py \
            --commit-id '${{ inputs.commit_sha || github.sha }}' \
            --upload-to-hub '${{ inputs.upload_to_hub || false}}' \
            --run-id '${{ inputs.run_id }}' \
            --benchmark-repo-id '${{ inputs.benchmark_repo_id}}' \
            --log-level INFO
        env:
          HF_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
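
The workflow declares `upload_to_hub` as a string input, so the script receives the literal text `true` or `false` on the command line. How `run_benchmarks.py` parses that value is not shown in this diff. The following is a minimal sketch, assuming an `argparse`-based CLI, of one way to convert such a value safely; `str_to_bool` and `build_parser` are hypothetical names used only for illustration. The sketch avoids `type=bool`, because `bool("false")` is `True` for any non-empty string.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Convert a CLI string such as 'true' or 'false' to a bool (hypothetical helper)."""
    normalized = value.strip().lower()
    if normalized in {"true", "1", "yes"}:
        return True
    if normalized in {"false", "0", "no", ""}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean string, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    # Illustrative parser only; the flag names mirror the workflow invocation,
    # but the real script's argument handling is an assumption here.
    parser = argparse.ArgumentParser(description="Sketch of the benchmark CLI flags")
    parser.add_argument("--commit-id", default="")
    parser.add_argument("--upload-to-hub", type=str_to_bool, default=False)
    parser.add_argument("--run-id", default="")
    parser.add_argument("--benchmark-repo-id", default="")
    parser.add_argument("--log-level", default="INFO")
    return parser


# The workflow passes the string input verbatim, e.g. --upload-to-hub 'false'.
args = build_parser().parse_args(["--upload-to-hub", "false"])
print(args.upload_to_hub)
```

With this approach an unexpected value such as `maybe` makes `argparse` exit with a usage error instead of silently enabling the upload.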
@@ -0,0 +1,20 @@
name: Benchmark v2 Scheduled Runner - A10 Single-GPU

on:
  schedule:
    # Run daily at 16:30 UTC
    - cron: "30 16 * * *"
  pull_request:
    types: [ opened, labeled, reopened, synchronize ]

jobs:
  benchmark-v2-default:
    name: Benchmark v2 - Default Models
    uses: ./.github/workflows/benchmark_v2.yml
    with:
      runner: aws-g5-4xlarge-cache-use1-public-80
      commit_sha: ${{ github.sha }}
      upload_to_hub: true
      run_id: ${{ github.run_id }}
      benchmark_repo_id: hf-internal-testing/transformers-daily-benchmarks
    secrets: inherit
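
This caller triggers on a daily schedule and on several pull request events, while the `if:` condition in the reusable workflow lets the job run only for scheduled events or for pull requests that carry the `run-benchmark` label. A minimal Python sketch of that condition follows, for illustration only. `should_run_benchmark` is a hypothetical name, not code from the repository, and the `lower()` calls approximate the case-insensitive string comparison used by GitHub Actions expressions.

```python
from typing import Iterable


def should_run_benchmark(event_name: str, pr_labels: Iterable[str] = ()) -> bool:
    """Mirror the workflow's `if:` expression (hypothetical helper for illustration)."""
    event = event_name.lower()
    labels = {label.lower() for label in pr_labels}
    # Pull requests run only when labeled; scheduled events always run.
    labeled_pr = event == "pull_request" and "run-benchmark" in labels
    return labeled_pr or event == "schedule"


print(should_run_benchmark("schedule"))
print(should_run_benchmark("pull_request", ["run-benchmark"]))
print(should_run_benchmark("pull_request", ["documentation"]))
```

Under this reading, an `opened` or `synchronize` event on an unlabeled pull request still starts the workflow, but the benchmark job itself is skipped.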
@@ -0,0 +1,20 @@
name: Benchmark v2 Scheduled Runner - MI325 Single-GPU

on:
  schedule:
    # Run daily at 16:30 UTC
    - cron: "30 16 * * *"
  pull_request:
    types: [ opened, labeled, reopened, synchronize ]

jobs:
  benchmark-v2-default:
    name: Benchmark v2 - Default Models
    uses: ./.github/workflows/benchmark_v2.yml
    with:
      runner: amd-mi325-ci-1gpu
      commit_sha: ${{ github.sha }}
      upload_to_hub: true
      run_id: ${{ github.run_id }}
      benchmark_repo_id: hf-internal-testing/transformers-daily-benchmarks
    secrets: inherit
@@ -21,6 +21,36 @@ python run_benchmarks.py \
  --num-tokens-to-generate 200
```

### Uploading Results to HuggingFace Dataset

You can automatically upload benchmark results to a HuggingFace Dataset for tracking and analysis:

```bash
# Upload to a public dataset with auto-generated run ID
python run_benchmarks.py --upload-to-hf username/benchmark-results

# Upload with a custom run ID for easy identification
python run_benchmarks.py --upload-to-hf username/benchmark-results --run-id experiment_v1
Review comment: Do we generate and print the
```

**Dataset Directory Structure:**
```
dataset_name/
├── 2025-01-15/
│   ├── runs/                           # Non-scheduled runs (manual, PR, etc.)
│   │   └── 123-1245151651/             # GitHub run number and ID
│   │       └── benchmark_results/
│   │           ├── benchmark_summary_20250115_143022.json
│   │           └── model-name/
│   │               └── model-name_benchmark_20250115_143022.json
│   └── benchmark_results_abc123de/     # Scheduled runs (daily CI)
│       ├── benchmark_summary_20250115_143022.json
│       └── model-name/
│           └── model-name_benchmark_20250115_143022.json
└── 2025-01-16/
    └── ...
```
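
The layout above can be read as a small path-building rule: scheduled runs land directly under the date folder, while other runs are nested under `runs/<run number>-<run ID>/`. The sketch below only illustrates that rule. `build_results_dir` and its parameters are hypothetical, and the `abc123de` suffix is treated as an opaque identifier because the README does not say how it is derived.

```python
from datetime import date
from pathlib import PurePosixPath


def build_results_dir(
    run_date: date,
    scheduled: bool,
    suffix: str = "",
    run_number: str = "",
    run_id: str = "",
) -> PurePosixPath:
    """Return the in-dataset folder for one benchmark run (hypothetical helper)."""
    day = PurePosixPath(run_date.isoformat())
    if scheduled:
        # Scheduled (daily CI) runs: <date>/benchmark_results_<suffix>/
        return day / f"benchmark_results_{suffix}"
    # Manual or PR runs: <date>/runs/<run number>-<run ID>/benchmark_results/
    return day / "runs" / f"{run_number}-{run_id}" / "benchmark_results"


print(build_results_dir(date(2025, 1, 15), scheduled=True, suffix="abc123de"))
print(build_results_dir(date(2025, 1, 15), scheduled=False, run_number="123", run_id="1245151651"))
```

Using `PurePosixPath` keeps forward slashes regardless of the machine running the script, which matches how paths inside a Hub dataset repository are written.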

### Running Specific Benchmarks

```bash
Review comment: Isn't this redundant with `upload_to_hub`?
Reply: Actually, the other has the wrong description, it's a boolean to toggle the upload on/off. Fixing it.