Skip to content

Conversation

markhallen
Copy link
Contributor

@markhallen markhallen commented Aug 22, 2025

What are you trying to accomplish?

What: This PR introduces a new GroupDependencySelector model that consolidates dependency group filtering, merging, and selection logic into a dedicated class. The class provides structured methods for handling dependency changes across multiple directories while maintaining group membership rules and observability.

Important

This PR should not interfere with existing functionality. We will start to utilize these new models in a follow up PR.

Why: Previously, dependency group logic was scattered across different parts of the updater system. This new model centralizes the responsibility for:

  • Filtering dependencies based on group membership rules
  • Merging dependency changes across multiple directories
  • Managing dependency attribution and metadata
  • Providing observability into group filtering decisions

Issues addressed: Creates a clean abstraction for dependency group operations, improving code organization and making the group dependency workflow more maintainable and testable.

Anything you want to highlight for special attention from reviewers?

Approach chosen: I created a dedicated model class rather than adding methods to existing classes to maintain separation of concerns and improve testability. The new GroupDependencySelector class encapsulates all group-related dependency operations:

  • filter_to_group!: Filters dependencies based on group membership rules with experiment flag support
  • merge_per_directory!: Merges dependency changes from multiple directories with proper deduplication
  • annotate_dependency_drift!: Adds metadata for observability and dependency drift detection
  • Helper methods: deduplicate_dependencies, collect_updated_files, create_merged_change for modular functionality

Key design decisions:

  1. New DependencyAttribution module: Created a reusable utility for adding metadata to dependencies, which could be useful elsewhere in the codebase
  2. Enhanced Dependency class: Extended with attribution support using getter/setter methods for flexibility
  3. Comprehensive test coverage: Added 25 test cases covering all functionality, edge cases, and Docker integration
  4. Sorbet type safety: All methods include proper type signatures for static analysis and runtime safety
  5. Experiment flag integration: Properly integrated with Dependabot's experiment system for feature rollout control

How will you know you've accomplished your goal?

Demonstrating success:

  1. New model functionality: The GroupDependencySelector class provides a clean interface for dependency group operations
  2. Code organization: Dependency group logic is now centralized in a dedicated, well-structured class
  3. Test coverage: 25 new test cases demonstrate the robustness of all class methods:
    • Group filtering with experiment flag handling
    • Multi-directory dependency merging and deduplication
    • Attribution functionality and metadata management
    • Error scenarios and edge cases
  4. Type safety: All methods include proper Sorbet type signatures and pass static analysis
  5. Integration: The new model integrates cleanly with existing Dependabot systems (experiments, logging, dependency changes)

Testing performed:

  • Comprehensive test suite covers all new functionality

Checklist

  • I have run the complete test suite to ensure all tests and linters pass.
  • I have thoroughly tested my code changes to ensure they work as expected, including adding additional tests for new functionality.
  • I have written clear and descriptive commit messages.
  • I have provided a detailed description of the changes in the pull request, including the problem it addresses, how it fixes the problem, and any relevant details about the implementation.
  • I have ensured that the code is well-documented and easy to understand.

Files Changed:

  • updater/lib/dependabot/updater/group_dependency_selector.rb (322 lines) - New dependency group selector model
  • updater/lib/dependabot/dependency_attribution.rb (99 lines) - Attribution utility module
  • updater/spec/dependabot/dependency_attribution_spec.rb (220 lines) - Attribution tests
  • updater/spec/dependabot/updater/group_dependency_selector_spec.rb (429 lines) - Main model tests
  • 🔧 common/lib/dependabot/dependency.rb - Added attribution support

Total: 1,083 lines added across 5 files introducing new dependency group management functionality.

@markhallen markhallen force-pushed the add-group-dependency-selector branch from a85f6d0 to 65ea110 Compare August 26, 2025 11:54
- Split merge_per_directory! method to fix Rubocop ABC size violation (35.48->compliant)
- Extract helper methods: deduplicate_dependencies, collect_updated_files, create_merged_change
- Add DependencyAttribution module for metadata management
- Extend Dependency class with attribution support (getter/setter methods)
- Implement comprehensive test coverage with 25 passing test cases
- Add detailed logging and error handling for group filtering logic
- Maintain backward compatibility with existing dependency change workflow

Files added:
- updater/lib/dependabot/updater/group_dependency_selector.rb (322 lines)
- updater/lib/dependabot/dependency_attribution.rb (99 lines)
- updater/spec/dependabot/dependency_attribution_spec.rb (220 lines)
- updater/spec/dependabot/updater/group_dependency_selector_spec.rb (429 lines)

Files modified:
- common/lib/dependabot/dependency.rb (added attribution support)
@markhallen markhallen force-pushed the add-group-dependency-selector branch from 65ea110 to c76a330 Compare August 26, 2025 15:50
@markhallen
Copy link
Contributor Author

markhallen commented Aug 28, 2025

GroupDependencySelector Analysis - Group Consistency Enforcement

Problem Statement

When recreating or rebasing PRs for dependency groups, dependencies outside the group were being included (a superset), leading to inconsistent group dependency selection. This analysis examines whether the GroupDependencySelector class ensures consistent group dependency selection and properly excludes dependencies outside the group.

The Superset Problem

The GroupDependencySelector is specifically designed to enforce strict group membership and exclude dependencies outside the group through a comprehensive filtering mechanism.

Core Filtering Mechanism

Primary Method: filter_to_group! (Lines 73-88)

This is the critical method that ensures only group-eligible dependencies are included:

def filter_to_group!(dependency_change)
  return unless Dependabot::Experiments.enabled?(:group_membership_enforcement)

  original_count = dependency_change.updated_dependencies.length
  group_eligible_deps, filtered_out_deps = partition_dependencies(dependency_change)

  # CRITICAL: Mutates the dependency change to only include group-eligible dependencies
  dependency_change.updated_dependencies.clear
  dependency_change.updated_dependencies.concat(group_eligible_deps)

  # Store filtered dependencies for observability
  @filtered_dependencies = filtered_out_deps if filtered_out_deps.any?
  # ... logging and metrics
end

Key Behavior: This method destructively modifies the DependencyChange object, removing any dependencies that don't belong to the group.

Dual-Filter Logic: partition_dependencies (Lines 143-170)

Implements two-layer filtering to ensure dependencies must pass BOTH checks:

Array(dependency_change.updated_dependencies).each do |dep|
  # Check BOTH group membership AND dependabot.yml configuration filters
  if group_contains_dependency?(dep, directory) && allowed_by_config?(dep, job)
    group_eligible_deps << dep  # ✅ Keep it
  else
    filtered_out_deps << dep    # ❌ Filter it out
  end
end

Two-Layer Filtering System

Layer 1: Group Membership Check - group_contains_dependency? (Lines 179-187)

  • Uses the group's own matching logic (@group.contains? or enhanced contains_dependency?)
  • Respects dependency patterns defined in the group configuration
  • Supports directory-aware group logic for monorepos
  • Handles both exact matches and wildcard patterns

Layer 2: Configuration Compliance - allowed_by_config? (Lines 191-198)

  • Respects ignore conditions from dependabot.yml
  • Honors allowed_updates configuration
  • Prevents processing of explicitly ignored dependencies
  • Ensures compliance with user-defined dependency policies

Additional Capabilities

Multi-Directory Support

The merge_per_directory! method (Lines 31-39) handles complex scenarios:

  • Merges dependency changes from multiple directories
  • Implements directory-aware deduplication
  • Prevents duplicate dependencies across directories in monorepos

Dependency Drift Detection

The annotate_dependency_drift! method (Lines 93-117) provides observability:

  • Detects when updated files reference dependencies not in the update list
  • Uses already-parsed dependencies from ecosystem-specific FileParser
  • Helps identify potential issues with dependency selection

Comprehensive Observability

  • Logging: Detailed logs showing which dependencies were filtered and why
  • Metrics: Tracks filtering statistics for monitoring
  • Attribution: Adds metadata to dependencies for debugging
  • Categorization: Groups filtered dependencies by reason (not in group vs config filtered)

Critical Implementation Details

⚠️ Feature Flag Dependency

return unless Dependabot::Experiments.enabled?(:group_membership_enforcement)

This filtering only works when the group_membership_enforcement feature flag is enabled.

⚠️ Must Be Actively Called

The filtering is not automatic - filter_to_group! must be explicitly called in the workflow where group PRs are created/updated.

✅ Mutation Strategy

The method uses a clear-and-replace strategy to ensure no dependencies slip through:

dependency_change.updated_dependencies.clear
dependency_change.updated_dependencies.concat(group_eligible_deps)

Does This Solve the Superset Problem?

YES, if properly integrated:

  1. Excludes non-group dependencies: The group_contains_dependency? check ensures only dependencies matching the group pattern are included
  2. Respects configuration: The allowed_by_config? check prevents processing of ignored or disallowed dependencies
  3. Provides visibility: Comprehensive logging shows exactly what was filtered and why
  4. Handles edge cases: Directory-aware logic and drift detection catch potential issues
  5. Destructive filtering: The mutation ensures the original dependency list is replaced, not just augmented

⚠️ Requirements for success:

  1. Feature flag must be enabled: group_membership_enforcement experiment must be active
  2. Must be called in workflow: filter_to_group! must be invoked during PR creation/recreation/rebasing
  3. Proper group configuration: The @group object must have correct dependency patterns configured
  4. Order dependency: Should be called after dependencies are determined but before file updates are generated

Recommended Verification Steps

To confirm this solves your superset problem:

  1. Check feature flag status: Verify group_membership_enforcement is enabled in your environment
  2. Trace workflow integration: Confirm filter_to_group! is called in PR recreation/rebase workflows
  3. Review group configuration: Ensure group patterns correctly identify intended dependencies
  4. Monitor logs: Check for filtering messages to see what dependencies are being excluded
  5. Test edge cases: Verify behavior with dependencies that should be excluded

Architecture Benefits

  • Separation of concerns: Dedicated class for group dependency selection logic
  • Testable: Comprehensive test suite validates filtering behavior
  • Observable: Rich logging and metrics for debugging and monitoring
  • Extensible: Supports enhanced group logic and custom filtering rules
  • Type-safe: Sorbet type annotations ensure correctness

Conclusion

The GroupDependencySelector class provides a robust solution to the dependency group superset problem through:

  • Strict group membership enforcement
  • Configuration compliance checking
  • Destructive filtering to prevent leakage
  • Comprehensive observability for debugging

The class will solve the dependency group superset problem once properly integrated into the PR creation/recreation workflow and the required feature flag is enabled.

@markhallen markhallen marked this pull request as ready for review August 28, 2025 13:33
@markhallen markhallen requested a review from a team as a code owner August 28, 2025 13:33
@markhallen markhallen merged commit 0f2da5d into main Aug 28, 2025
213 of 217 checks passed
@markhallen markhallen deleted the add-group-dependency-selector branch August 28, 2025 15:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants