feat(synthesis): relational/k-safety @property as a co-synthesis oracle (#397)#402
Merged
Conversation
…le (#397) Address #397's second direction — "accept relational specifications over several functions (k-safety), not just a single function's refinement type." A `@property` is aeon's vehicle for specs that span several functions or several runs of one function (e.g. `even n = !(odd n)`, `reverse (reverse x) = x`) and cannot be a single function's refinement. Extend the co-synthesis acceptance oracle (`_joint_accepts`, Contata Alg. 2 lines 11-13) to also run `@property` assertions under property-based testing: a joint candidate for a `mutual` group is accepted only if the per-function refinement types check, all `@example`s pass, AND all relational `@property`s hold. Thread `constructor_names` from the driver through `synthesize_holes` -> `_cosynthesize_group` -> `_joint_accepts` (for the property generators). Tests: a relational property over a mutual group accepts a correct even/odd and rejects an inconsistent pair (deterministic); end-to-end co-synthesis under a relational property yields a group that passes it. Example: relational_property.ae (runs green with `--test`). Remaining: property-driven CEGIS *guidance* (feed a failing property's counterexample through the z3 unsat-core machinery to derive per-function obligations) — currently the property is an accept/reject filter, not a guide. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Complete the third #397 direction's *guidance* layer: a relational `@property` now not only filters joint candidates but *drives* refinement (Contata Algorithm 2, lines 11-16), turning the unsat-core engine into a CEGIS loop. - evaluator: the instrumented trace now records each `Rec` call's parent index (a call tree), so structure edges are real parent->tail-child only — eagerly evaluated `@example` bindings (separate roots) can't forge spurious edges. `eval_with_trace` returns `[name, args, result, parent]` entries. - entrypoint: `_property_counterexamples` drives each `@property` over a small domain, builds the property's relational IR at each failing input, and collects the member calls it makes; `_property_to_ir` translates a property body (peeling type-applied operators like `(==)[Bool]`) into the IR consumed by `_smt_unsat_core_obligations`, which now also asserts those relational constraints. `_accumulate_trace` uses parent indices for sound tail-edges. `_refine_obligations` feeds example + property traces to the engine, so a property counterexample at a deep input propagates obligations down to the example-anchored base cases. - desugar: keep `mutual` members contiguous on the spine by appending decorator-generated helper bindings after all primary defs (an `@example` on a mutual member previously split the group and broke its evaluation). Tests: parent-tracked trace; property guides refinement (a wrong sibling + relational property yields a concrete obligation for the unspecified callee). Full suite green (1664 passed); run_examples.sh green (190). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…397) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…utual recursion (#397) A first-class Constraint-Annotated Tree Automaton, the paper's core artifact, replacing the enumerate-each-tree-and-typecheck approximation for the cases that need it (relational/example specs over recursive + mutually-recursive functions). aeon/synthesis/modules/contata/cata.py: - A candidate body is a tree over a bounded Int/Bool DSL whose alphabet includes the functions under synthesis (recursion + mutual recursion). - It is denoted *symbolically* at each input as a z3 expression over those functions treated as UNINTERPRETED functions (the constraint annotation): `g e` -> `g_uf(⟦e⟧)`. No evaluation => recursion never loops, no circularity. - A tree is accepted under a model (Def. 6) iff the spec ψ together with the tree's denotation at every example input is z3-satisfiable. The well-founded side condition (Fig. 5 Function-Call: recursive calls on strictly smaller inputs) rejects non-terminating self-consistent bodies; concrete-guard conditionals fold to the taken branch so the untaken branch isn't constrained. - The smallest accepted tree is returned (MinTree, Def. 11). A `mutual` group is solved against the *same* complete ψ, so members are mutually consistent by construction (the paper's product-across-inputs + global φ collapse to "be consistent with ψ"). Demonstrated: synthesizes the paper's MR flagship even/odd even = if x=0 then true else odd (x-1) odd = if x=0 then false else even (x-1) from examples — and the rendered definitions type check (with termination) and evaluate correctly. The enumerate-and-validate path could not converge on this. Tests: tests/contata_test.py. Self-contained module; no shared code touched. Remaining: wiring a `-s contata` CLI backend (component collection from the typing context, operator-term construction, @example extraction) is integration left as follow-up. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…tor (#397) FTA-style observational-equivalence compression of the version space's non-recursive fragment: a member-call-free sub-program is keyed by its concrete value over the example inputs, so the thousands of arithmetic/boolean terms collapse to one per distinct behaviour (symbolic terms still merge syntactically). Cuts even/odd synthesis ~8s -> ~4.5s; tests unchanged. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ackend (#397) Extends the Contata version space and wires it as a first-class synthesizer. * **PDS / list support** in the version space (`cata.py`): a `List Int` fragment with `isEmpty`/`head`/`tail` destructors. List values are opaque constants of an uninterpreted z3 sort; their structure folds concretely in the denotation, with list length as the well-founded measure. Recovers `length xs = if isEmpty xs then 0 else 1 + length (tail xs)` from trace-closed examples. The conditional enumerator now recognises a recursive call *under an operator* (`1 + length (tail xs)`), not only a bare call, and collects conditionals as terminal goal candidates so they are never crowded out of the bank by the list destructors' combinatorial growth. * **`-s contata` CLI backend** (`contata/synthesizer.py`): a single-hole adapter that reads a hole's `@example` I/O facts, runs the version space, rebinds the body onto the real in-scope parameter, recursive self-call, and operators (monomorphised at `Int`), and discharges the result through the liquid typechecker. Registered in the synthesizer factory and `--help`. Tests: `length` PDS synthesis and an end-to-end `-s contata` predicate fill. README + module docstrings document the version space vs. the typecheck-oracle `cata` backend. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ion space (#397) The MR flagship now runs from the CLI. `_cosynthesize_group` gains a version-space fast path: when the backend is `contata` and every member of a `mutual` block is a unary Int/Bool function specified by `@example` facts, all members are discharged in ONE `synthesize_group` SMT query — each body may call its siblings (uninterpreted functions), so the assignment is mutually consistent by construction. This converges on `even`/`odd` from examples (`even x = if x = 0 then true else odd (x-1)` and the dual), which the per-member sibling-as-typed-oracle loop could not. * `contata/synthesizer.py`: `_to_core` now rebinds member calls through a name map (self *and* siblings); new `cosynthesize_group` + `GroupMember` run the version space over a whole group and map each body to core. * `entrypoint.py`: `_contata_version_space_group` tries the group path first, gated on `_joint_accepts`; falls back to the existing loop otherwise. Tests: CLI mutual even/odd PBE co-synthesis. Example: `mutual_pbe.ae`. README + docstrings updated. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Surface syntax for collection literals, desugared to the library constructors (element types inferred by elaboration): [1, 2, 3] -> List.cons 1 (List.cons 2 (List.cons 3 List.nil)) #[1, 2, 3] -> Array.append (Array.append (Array.append Array.new 1) 2) 3 Supports empty literals (`[]` / `#[]`, element type from annotation), nesting, literals as arguments, and method-dot on a literal (`[1,2,3].length`). Two Lean-faithful parser details: * `[` was already type application. Disambiguated by whitespace exactly like Lean's `f[i]` vs `f [i]`: an *attached* `[` (`f[Int]`) is a type application; a *spaced/standalone* `[` is a list literal. `MethodDotPostLex` retags an attached `[` to `TYPE_LBRACKET`. (Verified: 0/1167 .ae files used a spaced type-application, so the change is safe.) * `#` is the comment marker, so the COMMENT regex now carves out `#[` (negative lookahead) so array literals are not swallowed. Also adds `]` to the receiver-ending tokens so `[..].method` and chained type applications `f[A][B]` parse. Tests: tests/collection_literal_test.py (9). Example: examples/syntax/collection_literals.ae. Docs: docs/index.md. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ication `{|p|}`
Per request, surface type application now uses braces — `f{Int}` instead of
`f[Int]`. This frees `[...]` to be unambiguously a list literal, so the
whitespace-sensitive `[` disambiguation added for collection literals is gone.
* Grammar: `application_t "{" type "}"` for type application;
`application_t {| expression |}` for refinement application (moved off `{…}`
to avoid colliding with `{t}`). The `{|`/`|}` terminals have priority above
`_PIPE` so they lex as one token, and are filtered from the tree.
* Postlexer reverts to method-dot-only; `]`/`}` are receiver-ending so
`[1,2,3].length` and `f{Int}.m` parse.
* Renderers: `STypeApplication.__str__` → `(b){t}`,
`SRefinementApplication.__str__` → `(b){|r|}`. The sugar pretty-printer keeps
eliding inferred type arguments (clean surface output).
* Core IR is unchanged (`[t]`): it is internal and re-parsed by core-grammar
tests.
* Migrated all surface type applications: 46 in `.ae` files plus inline aeon in
tests (incl. the one real refinement application and a refined-type argument
`max{{x:Int | ?hole}}`).
* Docs updated (`docs/index.md`, `docs/os-subprocess.md`).
Full suite: 1680 passed, 11 skipped. 1168-file parse sweep: 0 failures.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Follow-up to #398 (merged), addressing #397's second direction: "accept relational specifications over several functions (k-safety), not just a single function's refinement type."
A
@propertyis aeon's vehicle for specs that span several functions or several runs of one function — e.g.even n = !(odd n)(two functions) orreverse (reverse x) = x(2-safety, one function) — which cannot be a single function's refinement type. This PR makes such properties part of the co-synthesis acceptance oracle.How
_joint_accepts(Contata Algorithm 2, lines 11–13) now accepts a joint candidate for amutualgroup only if: the per-function refinement types check, all@examples pass, and all relational@propertys hold under property-based testing.constructor_namesis threaded from the driver throughsynthesize_holes→_cosynthesize_group→_joint_accepts(needed by the@propertygenerators).Tests / example
examples/synthesis/cata/relational_property.ae(runs green with--test).Reviewer notes / remaining
mutual ... endmutual recursion (#397) #398) to derive per-function obligations. Noted in the cata README.Natfunctions samples huge ints (recursion blow-up), so property domains should be bounded ({x:Int | 0 <= x && x < N}); the example/tests do this.🤖 Generated with Claude Code