[Performance] Use _PROXY_MaxParallelRequestsHandler_v3 by default again #14450
Merged: ishaan-jaff merged 7 commits into main from litellm_/performance/proxy-parallel-request-handler-v3 on Sep 13, 2025
Conversation
Force-pushed: 602dcba → 96bdf20, 5240e27 → 432fdee, 432fdee → 0be84c9, 0be84c9 → d6d36c8
(cherry picked from commit f3fa45cf8fbd5f5cce2f45a7312776d5005fb08e) (cherry picked from commit 5b680bb)
The rate limiter was incorrectly rejecting requests when the limit was met, but not exceeded. The check in `is_cache_list_over_limit` was `int(counter_value) + 1 > current_limit`, which caused the first request to be rejected if the limit was 1. This commit removes the `+ 1`, changing the logic to `int(counter_value) > current_limit`. The check now correctly allows requests up to the specified parallel limit.
Force-pushed: d6d36c8 → e62f0ec
Should be ready to merge, please review.
Follow up on #14420 and #14352

The rate limiter was incorrectly rejecting requests when the limit was met but not exceeded. The check in `is_cache_list_over_limit` was `int(counter_value) + 1 > current_limit`, which caused the first request to be rejected if the limit was 1. This PR removes the `+ 1`, changing the logic to `int(counter_value) > current_limit`, so the check now correctly allows requests up to the specified parallel limit. It also adds several tests to ensure correct behavior when handling parallel and sequential requests from multiple users.
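A minimal sketch of the off-by-one described above, assuming the counter has already been incremented for the incoming request before the check runs; the function and variable names here are illustrative, not copied from the proxy code:

```python
def is_over_limit_old(counter_value: int, current_limit: int) -> bool:
    # Old check: with current_limit = 1 and counter_value = 1 (the first
    # request already counted), 1 + 1 > 1 is True, so it gets rejected.
    return int(counter_value) + 1 > current_limit


def is_over_limit_new(counter_value: int, current_limit: int) -> bool:
    # New check: 1 > 1 is False, so the first request is allowed and only
    # requests beyond the configured parallel limit are rejected.
    return int(counter_value) > current_limit


assert is_over_limit_old(1, 1) is True   # first request wrongly rejected
assert is_over_limit_new(1, 1) is False  # first request allowed
assert is_over_limit_new(2, 1) is True   # request over the limit rejected
```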
Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR:

- Added testing in the `tests/` directory
- `make test-unit`
- `pytest tests/test_end_users.py::test_aaaend_user_specific_region tests/local_testing/test_pass_through_endpoints.py -k 'rpm or specific_region' -n 6 -vv`
Type
🆕 New Feature
🚄 Infrastructure
✅ Test
Changes