copilot-theorem: Produce a counterexample when the What4 backend falsifies a property #589

@RyanGlScott

Description

Currently, the Copilot.Theorem.What4.prove function returns a list of results, where each result contains a SatResult that describes whether a property is Valid, Invalid, or Unknown. However, an Invalid result carries no information about a specific counterexample, that is, concrete values that would lead Copilot to falsify the property. This makes the results of prove difficult to interpret.

It would be helpful if Copilot.Theorem.What4 could offer an API to prove or disprove a property such that disproven properties come with a concrete counterexample. This counterexample information could then be interpreted by users.

Type

  • Feature: Add counterexample capabilities to the What4 backend in copilot-theorem.

Additional context

None.

Requester

  • Ryan Scott (Galois).

Method to check presence of bug

Not applicable (not a bug).

Expected result

Introduce a new function to Copilot.Theorem.What4 that mirrors the type signature of prove, except that it returns a variant of SatResult where the Invalid equivalent encodes counterexample information. copilot-theorem users can then interpret the results of the counterexample in Copilot specifications.

Desired result

Introduce a new function to Copilot.Theorem.What4 that mirrors the type signature of prove, except that it returns a variant of SatResult where the Invalid equivalent encodes counterexample information. copilot-theorem users can then interpret the results of the counterexample in Copilot specifications.

Proposed solution

Introduce a new prove' :: Solver -> Spec -> IO [(Name, SatResult')] function (names subject to change during review), where SatResult' is defined to be:

data SatResult' = Valid' | Invalid' CounterExample | Unknown'

Here, CounterExample records enough information about a concrete counterexample that a Copilot user could display it.
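
As a rough illustration of how a caller might consume such results, here is a minimal sketch that uses only base. The contents of CounterExample are not specified in this issue, so the cexSteps field (one list of stream name and rendered value pairs per time step), the renderResult and renderCex helpers, and the sample data in main are all hypothetical. They are not part of copilot-theorem.

```haskell
module Main (main) where

import Data.List (intercalate)

-- Hypothetical shape for a counterexample: the issue leaves the contents
-- of CounterExample open, so this field is an assumption made purely for
-- illustration.
data CounterExample = CounterExample
  { cexSteps :: [[(String, String)]]
    -- ^ For each time step, the (stream name, rendered value) pairs.
  } deriving (Eq, Show)

-- The result type as proposed in the issue.
data SatResult' = Valid' | Invalid' CounterExample | Unknown'
  deriving (Eq, Show)

-- Render a single result as human-readable text (hypothetical helper).
renderResult :: SatResult' -> String
renderResult Valid'         = "valid"
renderResult Unknown'       = "unknown"
renderResult (Invalid' cex) = "invalid; counterexample:\n" ++ renderCex cex

-- Render one line per time step of the counterexample (hypothetical helper).
renderCex :: CounterExample -> String
renderCex (CounterExample steps) =
  intercalate "\n" (zipWith renderStep [0 :: Int ..] steps)
  where
    renderStep i vals =
      "  step " ++ show i ++ ": "
        ++ intercalate ", " [ name ++ " = " ++ val | (name, val) <- vals ]

main :: IO ()
main = mapM_ (putStrLn . report) results
  where
    report (name, res) = name ++ ": " ++ renderResult res
    -- Made-up results standing in for the output of the proposed prove'.
    results =
      [ ("prop1", Valid')
      , ("prop2", Invalid' (CounterExample [[("x", "3")], [("x", "4")]]))
      ]
```

Keeping the counterexample as data inside the Invalid' constructor, rather than printing it inside the prover, would let each user decide how to display or further process it.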

Further notes

None.

Metadata

Labels

  • CR:Status:Closed (Admin only: Change request that has been completed)
  • CR:Type:Feature (Admin only: Change request pertaining to new features requested)
