Implement allowPartitionRemapping feature for segment partition metadata management #16776
base: master
Pull Request Overview
This PR implements the allowPartitionRemapping feature for segment partition metadata management in Apache Pinot. The feature enables mapping segments with higher partition counts to tables with lower partition counts using modulo arithmetic, facilitating scenarios where Kafka partitions need scaling while maintaining fewer logical partitions in Pinot.
Key changes:
- Added allowPartitionRemapping field to the ColumnPartitionConfig class
- Modified SegmentPartitionMetadataManager to support partition ID remapping logic
- Added comprehensive test coverage with 3 new test methods validating remapping functionality
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
File | Description |
---|---|
ColumnPartitionConfig.java | Added allowPartitionRemapping boolean field and corresponding getter method |
SegmentPartitionMetadataManager.java | Added remapping logic with modulo-based partition assignment and validation |
BrokerRoutingManager.java | Updated constructor call to pass the new allowPartitionRemapping parameter |
SegmentPartitionMetadataManagerTest.java | Added 3 comprehensive test methods covering valid remapping, invalid cases, and disabled remapping scenarios |
… allowPartitionRemapping flag
- Add testPartitionIdRemappingLogic() to test modulo-based remapping when allowPartitionRemapping=true
  * Tests 8 Kafka partitions → 4 Pinot partitions remapping using _allowPartitionRemapping flag
  * Validates segments with IDs 0,4 map to partition 0; 1,5 map to partition 1; etc.
  * Verifies fully replicated server tracking with remapped partitions
- Add testPartitionIdRemappingInvalidCases() to test invalid remapping scenarios
  * Tests non-divisible partition counts (8 partitions → 3 partitions) with _allowPartitionRemapping=true
  * Validates segments are correctly marked as invalid when 8 % 3 ≠ 0
- Add testPartitionIdRemappingDisabled() to test behavior when _allowPartitionRemapping=false
  * Tests that segments with mismatched partition counts are marked invalid when flag is disabled
  * Validates only segments with exact partition count matches are accepted
  * Ensures _allowPartitionRemapping=false enforces strict partition count matching

These tests comprehensively validate the _allowPartitionRemapping flag behavior in SegmentPartitionMetadataManager, covering both enabled (with modulo remapping) and disabled (strict matching) scenarios. All tests pass and maintain compatibility with existing functionality.
Pull Request Overview
This PR implements the allowPartitionRemapping feature for segment partition metadata management in Apache Pinot. The feature enables remapping of higher Kafka partition IDs to lower Pinot partition numbers using modulo operation, which is useful when scaling Kafka partitions while maintaining fewer logical partitions in Pinot.
Key Changes:
- Added allowPartitionRemapping configuration field to enable/disable partition remapping
- Implemented modulo-based partition ID remapping logic (e.g., 8 Kafka partitions → 4 Pinot partitions)
- Enhanced segment partition validation to support both strict matching and flexible remapping modes
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
File | Description |
---|---|
ColumnPartitionConfig.java | Added allowPartitionRemapping field and constructor overloads to support the new configuration |
SegmentPartitionMetadataManager.java | Implemented core partition remapping logic with modulo operation and validation |
BrokerRoutingManager.java | Integrated allowPartitionRemapping flag from table configuration into manager instantiation |
SegmentPartitionMetadataManagerTest.java | Added comprehensive test coverage for remapping functionality and edge cases |
- Add @JsonInclude(JsonInclude.Include.NON_DEFAULT) to isAllowPartitionRemapping()
- This prevents serializing allowPartitionRemapping when it's false (default value)
- Fixes SegmentPartitionTest.testSegmentPartitionConfig which expected clean JSON without default fields
- Maintains backward compatibility for JSON serialization/deserialization

Fixes test failure: expected clean JSON without allowPartitionRemapping field when value is false
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## master #16776 +/- ##
============================================
+ Coverage 63.42% 63.46% +0.03%
+ Complexity 1400 1399 -1
============================================
Files 3054 3054
Lines 178766 178779 +13
Branches 27399 27403 +4
============================================
+ Hits 113378 113456 +78
+ Misses 56656 56607 -49
+ Partials 8732 8716 -16
Summary
Background
The use case: I have a fairly large table and, because of a real-world event, I need to scale ingestion. The usual way to do that is to double the Kafka partition count; after the event, however, I want to shrink the Kafka side back.
One best practice is to double the Kafka partition count while keeping the Pinot-side partitioning unchanged, so the traffic of old partition 0 is halved.
This avoids a heavy data rebalance and re-ingestion (for a table of 100+ TB, re-ingestion is almost impossible).
Since PR #11476 ensures that the partition mapping of all consuming/realtime segments follows the Kafka partitions, the key is to make sure the SegmentPartitionMetadataManager can produce the desired partitionId.
This PR implements the allowPartitionRemapping feature for segment partition metadata management in Apache Pinot. This feature enables remapping of higher Kafka partition IDs to lower Pinot partition numbers using modulo operation, which is useful when scaling Kafka partitions while maintaining fewer logical partitions in Pinot.
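To illustrate why a modulo remap keeps routing consistent, here is a minimal sketch. It assumes the producer assigns partitions as a non-negative hash modulo the partition count; that assumption, the class name, and the method names are illustrative and not taken from Pinot or Kafka. Under this assumption, remapping an 8-partition ID with a second modulo lands every key on the same partition it would have had with 4 partitions, while remapping 8 to 3 does not.

```java
// Hypothetical sketch: names are illustrative, not Pinot or Kafka APIs.
final class ModuloConsistencyDemo {

  // Assumed partitioner: non-negative hash modulo the partition count.
  static int partitionFor(int hash, int numPartitions) {
    return Math.floorMod(hash, numPartitions);
  }

  // Returns true when, for every hash in [0, maxHash], remapping the Kafka
  // partition ID with a second modulo yields the same partition the key would
  // get if it were partitioned directly by the Pinot partition count.
  static boolean remapIsConsistent(int kafkaPartitions, int pinotPartitions, int maxHash) {
    for (int hash = 0; hash <= maxHash; hash++) {
      int direct = partitionFor(hash, pinotPartitions);
      int remapped = partitionFor(hash, kafkaPartitions) % pinotPartitions;
      if (direct != remapped) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(remapIsConsistent(8, 4, 1000)); // true: 4 divides 8
    System.out.println(remapIsConsistent(8, 3, 1000)); // false: hash 8 gives 2 directly but 0 after remap
  }
}
```

The divisibility requirement falls out of the arithmetic: (h mod 8) mod 4 equals h mod 4 for every h because 4 divides 8, whereas no such identity holds for 8 and 3.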
Feature Implementation
Core Changes:
- ColumnPartitionConfig.java: Added allowPartitionRemapping field and getter method
- SegmentPartitionMetadataManager.java: Implemented partition remapping logic
- BrokerRoutingManager.java: Integrated allowPartitionRemapping flag
Remapping Logic:
- When allowPartitionRemapping=true: Supports modulo-based remapping (e.g., 8 Kafka partitions to 4 Pinot partitions)
- When allowPartitionRemapping=false: Enforces exact partition count matching
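The two modes above can be sketched as a single decision function. This is a simplified illustration under stated assumptions, not the code in SegmentPartitionMetadataManager: the class name, method name, and the use of an empty OptionalInt to mean "segment treated as invalid" are all hypothetical. The divisibility rule follows the test description in this PR (8 partitions to 3 is rejected because 8 % 3 ≠ 0).

```java
import java.util.OptionalInt;

// Hypothetical sketch of the remapping decision; not the actual Pinot implementation.
final class PartitionRemapSketch {

  // Returns the partition ID to use on the table side, or empty when the
  // segment's partition metadata should be treated as invalid.
  static OptionalInt resolvePartitionId(int segmentPartitionId, int segmentNumPartitions,
      int tableNumPartitions, boolean allowPartitionRemapping) {
    // Exact match is accepted in both modes and needs no remapping.
    if (segmentNumPartitions == tableNumPartitions) {
      return OptionalInt.of(segmentPartitionId);
    }
    // Strict mode: any partition count mismatch is invalid.
    if (!allowPartitionRemapping) {
      return OptionalInt.empty();
    }
    // Remapping mode: the segment's count must be a multiple of the table's count.
    if (tableNumPartitions <= 0 || segmentNumPartitions % tableNumPartitions != 0) {
      return OptionalInt.empty();
    }
    return OptionalInt.of(segmentPartitionId % tableNumPartitions);
  }

  public static void main(String[] args) {
    // 8 Kafka partitions mapped onto 4 Pinot partitions: IDs 1 and 5 share partition 1.
    System.out.println(resolvePartitionId(1, 8, 4, true)); // OptionalInt[1]
    System.out.println(resolvePartitionId(5, 8, 4, true)); // OptionalInt[1]
    System.out.println(resolvePartitionId(5, 8, 3, true)); // OptionalInt.empty
    System.out.println(resolvePartitionId(5, 8, 4, false)); // OptionalInt.empty
  }
}
```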
Test Coverage
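Added comprehensive test suite with 3 new test methods:
- testPartitionIdRemappingLogic(): modulo-based remapping with allowPartitionRemapping=true (8 Kafka partitions to 4 Pinot partitions)
- testPartitionIdRemappingInvalidCases(): non-divisible partition counts (8 partitions to 3 partitions) are marked invalid
- testPartitionIdRemappingDisabled(): strict partition count matching when allowPartitionRemapping=false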
Use Case
This feature addresses the common scenario where users need to: