Skip to content

Conversation

yashmayya
Copy link
Contributor

@yashmayya yashmayya commented Oct 11, 2024

  • Fixes Aggregation filter is being pushed down in a way it breaks SQL semantics #12647 and Query having just "one aggregation with filter" produces no result if aggregation' filter does not match any record #13982.
  • Both the above issues have a good description of the problem but the TLDR is that for group by queries with only filtered aggregations, we currently only compute the groups based on the rows that match the query's main filter and the rows that match at least one of the aggregation filters.
  • This is good for performance, since in cases where the group by query doesn't have a main filter and there are only filtered aggregations (i.e., no non-filtered aggregations), indexes can be used and a full scan is avoided. However, this leads to results that are not SQL compliant, since all groups (computed over the results of the main query filter) are expected to be returned even for queries with only filtered aggregations.
  • For the single-stage query engine, this actually appears to be a regression from Fixes filter aggs perf #10356, prior to which all groups should have been returned by default.
  • For the multi-stage query engine's group by executor, this appears to always have been an issue.
  • This patch takes a different approach for both the query engines - in the single-stage engine, the default behavior remains the same, but the standard SQL behavior of returning all groups can be enabled using a new query option. In the multi-stage engine, however, all groups will be computed by default. The reasoning is that Pinot has historically chosen performance over standard SQL behavior for the v1 engine in such cases where the behavioral difference might not be too significant. For the multi-stage query engine, however, we have been moving towards standard SQL behavior regardless of potential performance impact.
  • This patch fixes both the query engines to follow standard SQL behavior. However, a query option is also provided to override this behavior for performance reasons - i.e., if users are okay with skipping computation of empty groups (groups not matching any aggregation filter), then this query option can be set. This can provide a large performance boost if the main query filter has high selectivity.
  • For the v1 engine's executor, the fix is to simply put an additional AggregationInfo with an empty AggregationFunction array and the main query filter. This ensures that the GroupByExecutor will compute all the groups (from the result of applying the main query filter) but no additional unnecessary aggregation will be done.
  • For the v2 engine's executor, the fix is similar. Before this patch, all the group keys are only computed if there is at least one non-filtered aggregation. For group by queries with only filtered aggregations, groups were computed over the rows matching the aggregation filters. Now, all the groups will be computed even for group by queries with only filtered aggregations.

@yashmayya yashmayya added bugfix multi-stage Related to the multi-stage query engine SQL Compliance query labels Oct 11, 2024
@yashmayya
Copy link
Contributor Author

The PR description explains the rationale behind choosing different default behavior for the v1 and v2 query engines here. However, this currently means that there's no way to disable this behavior for the multi-stage query engine to improve query performance (and get non-standard behavior as a tradeoff). If we want to instead keep the behavior of the two query engines consistent, we have two options:

  • By default, always compute all groups for group by queries with only filtered aggregations. Add an option / performance hint to disable this behavior and improve query performance at the cost of non-standard behavior. The disadvantage of this option is that performance will suffer by default.
  • Keep the older behavior by default. Add an option to enable this standard SQL behavior of returning all groups at the cost of performance. The disadvantage of this is that the multi-stage query engine behavior in this case won't be SQL compliant by default (which is what we've been evolving it towards).

@yashmayya yashmayya marked this pull request as ready for review October 11, 2024 13:03
@yashmayya yashmayya requested a review from gortiz October 11, 2024 13:13
@codecov-commenter
Copy link

codecov-commenter commented Oct 11, 2024

Codecov Report

Attention: Patch coverage is 60.00000% with 4 lines in your changes missing coverage. Please review.

Project coverage is 63.81%. Comparing base (59551e4) to head (eb934a6).
Report is 1198 commits behind head on master.

Files with missing lines Patch % Lines
...t/core/operator/query/FilteredGroupByOperator.java 50.00% 1 Missing and 1 partial ⚠️
...ry/runtime/operator/MultistageGroupByExecutor.java 33.33% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #14211      +/-   ##
============================================
+ Coverage     61.75%   63.81%   +2.05%     
- Complexity      207     1536    +1329     
============================================
  Files          2436     2623     +187     
  Lines        133233   144424   +11191     
  Branches      20636    22099    +1463     
============================================
+ Hits          82274    92157    +9883     
- Misses        44911    45462     +551     
- Partials       6048     6805     +757     
Flag Coverage Δ
custom-integration1 100.00% <ø> (+99.99%) ⬆️
integration 100.00% <ø> (+99.99%) ⬆️
integration1 100.00% <ø> (+99.99%) ⬆️
integration2 0.00% <ø> (ø)
java-11 63.76% <60.00%> (+2.05%) ⬆️
java-21 63.69% <60.00%> (+2.06%) ⬆️
skip-bytebuffers-false 63.80% <60.00%> (+2.05%) ⬆️
skip-bytebuffers-true 63.66% <60.00%> (+35.93%) ⬆️
temurin 63.81% <60.00%> (+2.05%) ⬆️
unittests 63.80% <60.00%> (+2.05%) ⬆️
unittests1 55.54% <60.00%> (+8.65%) ⬆️
unittests2 34.31% <0.00%> (+6.58%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Comment on lines 244 to 248
if (intKeys == null) {
// _groupIdGenerator should still have all the groups even if there are only filtered aggregates for SQL
// compliant results.
generateGroupByKeys(block);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't be semantically the same to always initialize intKeys to generateGroupByKeys(block); in line 209?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, this is actually left over from my earlier choice of making it optional for the multi-stage engine too. I can move it back if we decide we're okay with keeping this the default behavior for the multi-stage engine.

Comment on lines 1019 to 1020
// Write the query in a way where the aggregation will not be pushed to the leaf stage, so that we can test the
// MultistageGroupByExecutor
Copy link
Contributor

@gortiz gortiz Oct 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think you can use the is_skip_leaf_stage_group_by hint to do so

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nice, I wasn't aware of the existence of this hint.

@@ -3729,4 +3729,39 @@ public void testSkipIndexes(boolean useMultiStageQueryEngine)
updateTableConfig(tableConfig);
reloadAllSegments(TEST_UPDATED_RANGE_INDEX_QUERY, true, numTotalDocs);
}

@Test
public void testFilteredAggregationWithNoValueMatchingAggregationFilter()
Copy link
Contributor

@gortiz gortiz Oct 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can you create one test having the query option and another without it?

Copy link
Contributor

@gortiz gortiz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I think the current solution is fine, but in a future PR it would be cool to support filteredAggregationsComputeAllGroups query option in multi-stage as well.

Copy link
Contributor Author

@yashmayya yashmayya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but in a future PR it would be cool to support filteredAggregationsComputeAllGroups query option in multi-stage as well.

Do you mean keep this new default SQL compliant behavior but use the same query option to allow users to choose the other behavior (i.e., via SET filteredAggregationsComputeAllGroups=false)? That should be straightforward and I can do it in this PR too if we have consensus. I was just a little worried about introducing new behavioral differences between the two query engines and any additional confusion about using a query option to get the opposite behavior in the engines. But I guess we might be fine with the right documentation.

Comment on lines 244 to 248
if (intKeys == null) {
// _groupIdGenerator should still have all the groups even if there are only filtered aggregates for SQL
// compliant results.
generateGroupByKeys(block);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, this is actually left over from my earlier choice of making it optional for the multi-stage engine too. I can move it back if we decide we're okay with keeping this the default behavior for the multi-stage engine.

Comment on lines 1019 to 1020
// Write the query in a way where the aggregation will not be pushed to the leaf stage, so that we can test the
// MultistageGroupByExecutor
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nice, I wasn't aware of the existence of this hint.

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, this is much easier than expected!

@@ -384,7 +385,12 @@ public static List<AggregationInfo> buildFilteredAggregationInfos(SegmentContext
}
}

if (!nonFilteredFunctions.isEmpty()) {
if (!nonFilteredFunctions.isEmpty() || QueryOptionsUtils.isFilteredAggregationsComputeAllGroups(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, didn't expect this to be so simple! NIce!

@@ -241,6 +241,11 @@ private void processAggregate(TransferableBlock block) {
aggFunction.aggregateGroupBySV(numMatchedRows, filteredIntKeys, groupByResultHolder, blockValSetMap);
}
}
if (intKeys == null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make this also controlled by query option to be consistent with v1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, makes sense, that's also what Gonzalo suggested.

// are expected to be returned). If this option is set to true, the v1 query engine will compute all groups for
// group by queries with only filtered aggregations. This could require a full scan if the main query does not
// have a filter and performance could be much worse, but the result will be SQL compliant.
public static final String FILTERED_AGGREGATIONS_COMPUTE_ALL_GROUPS = "filteredAggregationsComputeAllGroups";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the overhead seems fine (no extra aggregate needed), I'd suggest doing the standard SQL behavior by default, and add an option to skip the all groups computation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the overhead seems fine (no extra aggregate needed), I'd suggest doing the standard SQL behavior by default, and add an option to skip the all groups computation

I think it makes sense to have standard SQL behavior by default and have an option to use the more performant method of only computing required groups. But won't the overhead potentially be quite high for a query like SELECT SUM(X) FILTER (WHERE Y = 1) FROM mytable and there's an inverted index on the filter column Y? If we want to compute all groups, it'll involve a scan over all docs whereas with the previous default behavior, no scan would be required and we'd only need to do the aggregation over the docs matching the filter computed using the inverted index. There won't be any additional aggregation being done but the scan would still be expensive right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is true if the selectivity of the filter is very high. I'm okay with the current approach to ensure no performance regression. We can have a separate single option in the future to turn everything into standard SQL

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we can have a startup config option to choose and the query option to change the default. It is true that having different behaviors by default on each query engine is strange and by having a startup config option each deployment can customize what do they want (current performance or sql compliance)

Copy link
Contributor Author

@yashmayya yashmayya Oct 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed a change to allow overriding the multi-stage engine behavior in order to prefer performance over SQL compliant results using the same query option filteredAggregationsComputeAllGroups.

Yes, I think we can have a startup config option to choose and the query option to change the default.

I think we can add such a config as a follow up if there's demand for such behavior?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, thinking about this some more, I really don't like introducing additional behavioral differences between the two query engines. Also, using the same query option but with effectively opposite default values in the query engines is fairly unintuitive. I think it makes more sense to do the right thing by default, and allow users to override the behavior for performance reasons using a query option (which will have the same behavior in both query engines).

… in order to prefer performance over SQL compliant results using same query option
@yashmayya yashmayya force-pushed the filtered-aggregation-compute-all-groups branch 2 times, most recently from fc0edd3 to 68124da Compare October 12, 2024 05:02
@yashmayya yashmayya force-pushed the filtered-aggregation-compute-all-groups branch from 68124da to eb934a6 Compare October 12, 2024 05:57
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Please add the new query option to the documentation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Aggregation filter is being pushed down in a way it breaks SQL semantics
4 participants