@Bobronium (Collaborator) commented Sep 11, 2025

Follow up on #14420 and #14352

The rate limiter was incorrectly rejecting requests when the limit was met, but not exceeded. The check in `is_cache_list_over_limit` was `int(counter_value) + 1 > current_limit`, which caused the first request to be rejected if the limit was 1.

This PR removes the `+ 1`, changing the logic to `int(counter_value) > current_limit`. The check now correctly allows requests up to the specified parallel limit.
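
A minimal sketch of the off-by-one, not the actual LiteLLM code. The two helper functions are hypothetical stand-ins that reproduce only the comparison quoted above. The sketch assumes the counter has already been incremented for the current request by the time the check runs, which is consistent with the first request being rejected at a limit of 1.

```python
def over_limit_old(counter_value, current_limit):
    # Comparison before this PR, as quoted in the description.
    return int(counter_value) + 1 > current_limit


def over_limit_new(counter_value, current_limit):
    # Comparison after this PR: reject only once the limit is exceeded.
    return int(counter_value) > current_limit


# Assumption: the counter already includes the current request,
# so the first request sees a counter value of 1.
first_request_counter = "1"

old_rejects_first = over_limit_old(first_request_counter, 1)
new_rejects_first = over_limit_new(first_request_counter, 1)
new_rejects_second = over_limit_new("2", 1)
```

Under that assumption, the old check flags the first request at a limit of 1 as over the limit, while the new check admits it and only flags the second concurrent request.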

It also adds several tests to ensure correct behavior when handling parallel/sequential requests from multiple users.
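
The kind of behavior such tests cover can be sketched with a toy in-memory limiter. Everything below is hypothetical and for illustration only (`InMemoryParallelLimiter`, `admitted_per_user`, and `sequential_admitted` are not names from the PR or from LiteLLM); it assumes an increment-then-check counter per user and uses the corrected comparison.

```python
import asyncio
from collections import defaultdict


class InMemoryParallelLimiter:
    """Hypothetical stand-in for a per-user parallel request limiter."""

    def __init__(self, limit):
        self.limit = limit
        self.counters = defaultdict(int)

    async def handle(self, user):
        # Increment first, then check (assumed ordering).
        self.counters[user] += 1
        if self.counters[user] > self.limit:
            self.counters[user] -= 1
            return False  # rejected: limit exceeded
        try:
            await asyncio.sleep(0.01)  # simulate in-flight work
            return True
        finally:
            self.counters[user] -= 1  # release the slot


async def admitted_per_user(limit, users, requests_per_user):
    # Fire all requests for all users concurrently.
    limiter = InMemoryParallelLimiter(limit)
    owners = [u for u in users for _ in range(requests_per_user)]
    results = await asyncio.gather(*(limiter.handle(u) for u in owners))
    admitted = {u: 0 for u in users}
    for user, ok in zip(owners, results):
        admitted[user] += int(ok)
    return admitted


async def sequential_admitted(limit, user, n):
    # One request at a time: a slot is always free again before the next one.
    limiter = InMemoryParallelLimiter(limit)
    return sum([int(await limiter.handle(user)) for _ in range(n)])


parallel = asyncio.run(admitted_per_user(2, ["alice", "bob"], 5))
sequential = asyncio.run(sequential_admitted(1, "alice", 5))
```

In this sketch each user gets exactly `limit` parallel requests admitted independently of other users, and sequential requests are never rejected, even at a limit of 1.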

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/ directory
  • I have added a screenshot of my new test passing locally
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
pytest tests/test_end_users.py::test_aaaend_user_specific_region tests/local_testing/test_pass_through_endpoints.py -k 'rpm or specific_region' -n 6 -vv

image

Type

🆕 New Feature
🚄 Infrastructure
✅ Test

Changes

Bobronium and others added 6 commits September 12, 2025 16:05
(cherry picked from commit f3fa45cf8fbd5f5cce2f45a7312776d5005fb08e)
(cherry picked from commit 5b680bb)
@Bobronium force-pushed the litellm_/performance/proxy-parallel-request-handler-v3 branch from d6d36c8 to e62f0ec on September 12, 2025 14:07

@Bobronium (Collaborator, Author) commented Sep 12, 2025

Should be ready to merge. Please review.

@ishaan-jaff merged commit f4318bc into main Sep 13, 2025
42 of 47 checks passed