| 1 | +(new-model-tests)= |
| 2 | + |
| 3 | +# Writing Unit Tests |
| 4 | + |
| 5 | +This page explains how to write unit tests to verify the implementation of your model. |
| 6 | + |
| 7 | +## Required Tests |
| 8 | + |
| 9 | +These tests are necessary to get your PR merged into the vLLM library. |
| 10 | +Without them, the CI for your PR will fail. |
| 11 | + |
| 12 | +### Model loading |
| 13 | + |
| 14 | +Include an example Hugging Face repository for your model in <gh-file:tests/models/registry.py>. |
| 15 | +This enables a unit test that loads dummy weights to ensure that the model can be initialized in vLLM. |
| 16 | + |
| 17 | +```{important} |
| 18 | +The list of models in each section should be maintained in alphabetical order. |
| 19 | +``` |
| 20 | + |
| 21 | +```{tip} |
| 22 | +If your model requires a development version of HF Transformers, you can set |
| 23 | +`min_transformers_version` to skip the test in CI until the model is released. |
| 24 | +``` |
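As a rough illustration of the registry entry described above, the sketch below shows what adding a model might look like. `ExampleInfo`, `EXAMPLE_MODELS`, the architecture names, the repository IDs, and the version string are all stand-ins invented so that the snippet runs on its own; consult `tests/models/registry.py` for the actual helper class and the fields it accepts. Only the `min_transformers_version` name comes from the text above, and treating "alphabetical order" as a plain string sort is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ExampleInfo:
    """Hypothetical stand-in for the registry's per-model record."""

    default: str  # example Hugging Face repository ID
    # Assumed semantics: skip the test when the installed version is older.
    min_transformers_version: Optional[str] = None


# Hypothetical registry section: architecture name -> example repository.
EXAMPLE_MODELS: Dict[str, ExampleInfo] = {
    "AlphaForCausalLM": ExampleInfo("example-org/alpha-7b"),
    "MyNewForCausalLM": ExampleInfo(
        "example-org/my-new-model",
        min_transformers_version="4.99",  # placeholder version
    ),
    "ZetaForCausalLM": ExampleInfo("example-org/zeta-1b"),
}


def is_alphabetical(section: Dict[str, object]) -> bool:
    """Return True if the keys appear in sorted order (dicts keep insertion order)."""
    keys = list(section)
    return keys == sorted(keys)
```

Running a check like `is_alphabetical` locally before opening the PR is one way to catch an out-of-order entry early.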
| 25 | + |
| 26 | +## Optional Tests |
| 27 | + |
| 28 | +These tests are not required to get your PR merged into the vLLM library. |
| 29 | +Passing these tests provides more confidence that your implementation is correct, and helps avoid future regressions. |
| 30 | + |
| 31 | +### Model correctness |
| 32 | + |
| 33 | +These tests compare the model outputs of vLLM against [HF Transformers](https://github.com/huggingface/transformers). You can add new tests under the subdirectories of <gh-dir:tests/models>. |
| 34 | + |
| 35 | +#### Generative models |
| 36 | + |
| 37 | +For [generative models](#generative-models), there are two levels of correctness tests, as defined in <gh-file:tests/models/utils.py>: |
| 38 | + |
| 39 | +- Exact correctness (`check_outputs_equal`): The text generated by vLLM should exactly match the text generated by HF. |
| 40 | +- Logprobs similarity (`check_logprobs_close`): The logprobs generated by vLLM should be in the top-k logprobs generated by HF, and vice versa. |
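To make the difference between the two levels concrete, here is a minimal, self-contained sketch of the idea. It is not the code in `tests/models/utils.py`: the function names, the argument layout (token IDs plus one top-k dictionary per position), and the decision to stop at the first position where the two sequences diverge are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def outputs_equal(vllm_texts: Sequence[str], hf_texts: Sequence[str]) -> bool:
    """Exact correctness: every generated text must match exactly."""
    return list(vllm_texts) == list(hf_texts)


def logprobs_close(
    tokens_a: List[int],
    top_logprobs_a: List[Dict[int, float]],
    tokens_b: List[int],
    top_logprobs_b: List[Dict[int, float]],
) -> bool:
    """Logprobs similarity between two runs of the same prompt.

    top_logprobs_x[i] maps token ID -> logprob for the top-k candidates
    that run x considered at position i.
    """
    for pos, (tok_a, tok_b) in enumerate(zip(tokens_a, tokens_b)):
        if tok_a == tok_b:
            continue
        # At the first divergence, each run's chosen token must appear in
        # the other run's top-k candidates. Later positions are skipped
        # because the two contexts are no longer the same (an assumption).
        return tok_a in top_logprobs_b[pos] and tok_b in top_logprobs_a[pos]
    return True
```

The second check is the looser of the two: small numerical differences can flip the choice between two near-tied tokens, which fails an exact text match but still passes as long as each run ranked the other's token among its top-k.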
| 41 | + |
| 42 | +#### Pooling models |
| 43 | + |
| 44 | +For [pooling models](#pooling-models), we simply check the cosine similarity between the vLLM and HF outputs, as defined in <gh-file:tests/models/embedding/utils.py>. |
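A cosine-similarity check of this kind can be sketched in a few lines. The helper names and the tolerance value below are assumptions chosen for illustration, not the ones used in the vLLM test utilities.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def embeddings_close(
    vllm_embedding: Sequence[float],
    hf_embedding: Sequence[float],
    tol: float = 1e-2,  # hypothetical tolerance
) -> bool:
    """Pass when the two embeddings point in nearly the same direction."""
    return 1.0 - cosine_similarity(vllm_embedding, hf_embedding) <= tol
```

Comparing direction rather than raw values keeps the check insensitive to a uniform rescaling of the embedding, at the cost of not detecting a difference in magnitude.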
| 45 | + |
| 46 | +(mm-processing-tests)= |
| 47 | + |
| 48 | +### Multi-modal processing |
| 49 | + |
| 50 | +#### Common tests |
| 51 | + |
| 52 | +Adding your model to <gh-file:tests/models/multimodal/processing/test_common.py> verifies that the following input combinations result in the same outputs: |
| 53 | + |
| 54 | +- Text + multi-modal data |
| 55 | +- Tokens + multi-modal data |
| 56 | +- Text + cached multi-modal data |
| 57 | +- Tokens + cached multi-modal data |
| 58 | + |
| 59 | +#### Model-specific tests |
| 60 | + |
| 61 | +You can add a new file under <gh-dir:tests/models/multimodal/processing> to run tests that only apply to your model. |
| 62 | + |
| 63 | +For example, if the HF processor for your model accepts user-specified keyword arguments, you can verify that the keyword arguments are being applied correctly, such as in <gh-file:tests/models/multimodal/processing/test_phi3v.py>. |