BackendBench is an evaluation suite for testing how well LLMs and humans can write PyTorch backends. It lets developers add custom kernels in an organized directory structure and dynamically override PyTorch's core operators at runtime, resulting in a fully functional PyTorch backend that you can pip install and use with existing models, with no modeling code changes required.
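The runtime override mechanism can be sketched with PyTorch's public `torch.library` API (a hedged illustration, not necessarily BackendBench's exact internals): a custom kernel is registered against an existing `aten` operator, and every subsequent call dispatches to it.

```python
import torch

# Open the aten namespace for implementation overrides.
lib = torch.library.Library("aten", "IMPL")

def custom_relu(x):
    # Stand-in for a hand-written kernel.
    return torch.where(x > 0, x, torch.zeros_like(x))

# Route aten::relu on CPU tensors to our implementation.
lib.impl("relu", custom_relu, "CPU")

# Existing model code calling torch.relu now runs custom_relu,
# with no modeling code changes.
print(torch.relu(torch.tensor([-1.0, 2.0])))
```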
Features:
- Comprehensive edge case correctness testing via PyTorch's OpInfo and FACTO test suites
- Performance benchmarks using real tensor shapes from popular Hugging Face models
- Clean path to upstream your kernels to PyTorch (if it passes our tests, it's likely correct enough to merge)
Many kernel optimization efforts struggle with correctness. Our approach ensures your kernels are production-ready by holding them to PyTorch's own standards. You can learn more about correctness in our launch blog and launch video.
```bash
pip install .
```
- Create operator directories:
```bash
python -m BackendBench.scripts.setup_operator_directories
```
- Implement kernels: each directory contains an empty op implementation. Fill it out yourself, or get your LLM to do it!
- Test your implementations:
```bash
# Smoke test to make sure everything is in check
python BackendBench/scripts/main.py --suite smoke --backend aten
# OpInfo correctness tests
python BackendBench/scripts/main.py --suite opinfo --backend directory
# TorchBench performance tests
python BackendBench/scripts/main.py --suite torchbench --backend directory
```
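A filled-in kernel might look like the sketch below. This is a hypothetical example: the actual function name and signature come from the stub generated in each operator directory, and `relu_kernel` here is illustrative.

```python
import torch

def relu_kernel(input: torch.Tensor) -> torch.Tensor:
    # Eager reference implementation of relu; a real submission
    # might instead use a Triton or CUDA kernel for performance.
    return torch.clamp_min(input, 0)
```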
See the BackendBench Example for a practical demonstration of using BackendBench for model convergence testing.
Source code is made available under the BSD 3-Clause license.