Skip to content

Conversation

aschackmull
Copy link
Contributor

This introduces a new shared library for computing control flow reachability. The computation follows the value of a single variable through value-preserving SSA definitions (i.e. phi nodes, plus optionally uncertain writes). Splitting on finally blocks and other variables is performed to verify the validity of the computed path.
Two output predicates are exposed: flow, which indicates a path from a source to a sink, and escapeFlow, which indicates that a source may escape in the sense that it circumvents all sinks.

This new library is then instantiated for Java and used to replace the Java Nullness analysis. This generally results in a precision improvement, although there are a few feature gaps for which we may have regressions compared to the old implementation. These gaps are

  • Properly observing certain assertion calls related to nullness.
  • Splitting based on correlated conditions of the form x == y.
  • Splitting based on correlated conditions of the form x instanceof Foo.

Commit-by-commit review is encouraged.

@aschackmull aschackmull requested a review from a team as a code owner September 4, 2025 13:07
@Copilot Copilot AI review requested due to automatic review settings September 4, 2025 13:07
@github-actions github-actions bot added the Java label Sep 4, 2025
@@ -0,0 +1,54 @@
import java
private import codeql.controlflow.ControlFlow
private import semmle.code.java.dataflow.SSA as SSA

Check warning

Code scanning / CodeQL

Names only differing by case Warning

SSA is only different by casing from Ssa that is used elsewhere for modules.
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a new shared control flow reachability library and replaces the Java Nullness implementation with it. The shared library tracks a single variable through value-preserving SSA definitions and uses splitting on finally blocks and guard values to improve precision. The Java Nullness analysis is then updated to use this new approach, which provides more accurate tracking of null values through control flow paths.

Key changes:

  • Added shared control flow reachability computation with finally block tracking
  • Implemented integer range guard values and improved guard value intersection logic
  • Simplified Java Nullness implementation to use the new shared library

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
shared/controlflow/codeql/controlflow/Guards.qll Added integer range guard values and intersection logic for improved guards analysis
shared/controlflow/codeql/controlflow/ControlFlow.qll New shared control flow reachability library with SSA tracking and finally block handling
java/ql/lib/semmle/code/java/controlflow/ControlFlow.qll Java instantiation of the shared control flow library
java/ql/lib/semmle/code/java/controlflow/BasicBlocks.qll Updated to use new successor type system from shared library
java/ql/lib/semmle/code/java/ControlFlowGraph.qll Added successor type mapping for control flow edges
java/ql/lib/semmle/code/java/dataflow/Nullness.qll Simplified nullness analysis to use new shared library
java/ql/test/query-tests/Nullness/*.java Updated test cases reflecting precision improvements
java/ql/test/query-tests/Nullness/NullMaybe.expected Expected test results with improved precision

Comment on lines +211 to +215
TIntRange(int bound, Boolean upper) {
exists(ConstantExpr c | c.asIntegerValue() + [-1, 0, 1] = bound) and
bound != 2147483647 and
bound != -2147483648
} or
Copy link
Preview

Copilot AI Sep 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The integer bounds 2147483647 and -2147483648 appear to be hardcoded limits for 32-bit integers. Consider using named constants or a more explicit representation to make the intent clearer and improve maintainability.

Copilot uses AI. Check for mistakes.

@aschackmull
Copy link
Contributor Author

Dca shows a bit too much precision regression, so taking this back into draft while I investigate.

@aschackmull aschackmull marked this pull request as draft September 5, 2025 09:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant