Skip to content

feat(synthesis): relational/k-safety @property as a co-synthesis oracle (#397)#402

Merged
alcides merged 9 commits into
masterfrom
claude/issue-397-cosynthesis
Jun 20, 2026
Merged

feat(synthesis): relational/k-safety @property as a co-synthesis oracle (#397)#402
alcides merged 9 commits into
masterfrom
claude/issue-397-cosynthesis

Conversation

@alcides

@alcides alcides commented Jun 19, 2026

Copy link
Copy Markdown
Owner

What

Follow-up to #398 (merged), addressing #397's second direction: "accept relational specifications over several functions (k-safety), not just a single function's refinement type."

A @property is aeon's vehicle for specs that span several functions or several runs of one function — e.g. even n = !(odd n) (two functions) or reverse (reverse x) = x (2-safety, one function) — which cannot be a single function's refinement type. This PR makes such properties part of the co-synthesis acceptance oracle.

How

  • _joint_accepts (Contata Algorithm 2, lines 11–13) now accepts a joint candidate for a mutual group only if: the per-function refinement types check, all @examples pass, and all relational @propertys hold under property-based testing.
  • constructor_names is threaded from the driver through synthesize_holes_cosynthesize_group_joint_accepts (needed by the @property generators).

Tests / example

  • A relational property over a mutual group accepts a correct even/odd and rejects an inconsistent pair (odd's base flipped) — deterministic.
  • End-to-end co-synthesis under a relational property yields a group that passes it.
  • examples/synthesis/cata/relational_property.ae (runs green with --test).

Reviewer notes / remaining

  • The property is currently an accept/reject filter. The natural next layer is property-driven CEGIS guidance: feed a failing property's counterexample through the z3 unsat-core machinery (already in the tree from feat(mutual): Lean-style mutual ... end mutual recursion (#397) #398) to derive per-function obligations. Noted in the cata README.
  • PBT over recursive-on-Nat functions samples huge ints (recursion blow-up), so property domains should be bounded ({x:Int | 0 <= x && x < N}); the example/tests do this.

🤖 Generated with Claude Code

alcides and others added 9 commits June 19, 2026 18:43
…le (#397)

Address #397's second direction — "accept relational specifications over several
functions (k-safety), not just a single function's refinement type."

A `@property` is aeon's vehicle for specs that span several functions or several
runs of one function (e.g. `even n = !(odd n)`, `reverse (reverse x) = x`) and
cannot be a single function's refinement. Extend the co-synthesis acceptance
oracle (`_joint_accepts`, Contata Alg. 2 lines 11-13) to also run `@property`
assertions under property-based testing: a joint candidate for a `mutual` group
is accepted only if the per-function refinement types check, all `@example`s
pass, AND all relational `@property`s hold. Thread `constructor_names` from the
driver through `synthesize_holes` -> `_cosynthesize_group` -> `_joint_accepts`
(for the property generators).

Tests: a relational property over a mutual group accepts a correct even/odd and
rejects an inconsistent pair (deterministic); end-to-end co-synthesis under a
relational property yields a group that passes it. Example:
relational_property.ae (runs green with `--test`).

Remaining: property-driven CEGIS *guidance* (feed a failing property's
counterexample through the z3 unsat-core machinery to derive per-function
obligations) — currently the property is an accept/reject filter, not a guide.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Complete the third #397 direction's *guidance* layer: a relational `@property`
now not only filters joint candidates but *drives* refinement (Contata
Algorithm 2, lines 11-16), turning the unsat-core engine into a CEGIS loop.

- evaluator: the instrumented trace now records each `Rec` call's parent index
  (a call tree), so structure edges are real parent->tail-child only — eagerly
  evaluated `@example` bindings (separate roots) can't forge spurious edges.
  `eval_with_trace` returns `[name, args, result, parent]` entries.
- entrypoint: `_property_counterexamples` drives each `@property` over a small
  domain, builds the property's relational IR at each failing input, and
  collects the member calls it makes; `_property_to_ir` translates a property
  body (peeling type-applied operators like `(==)[Bool]`) into the IR consumed
  by `_smt_unsat_core_obligations`, which now also asserts those relational
  constraints. `_accumulate_trace` uses parent indices for sound tail-edges.
  `_refine_obligations` feeds example + property traces to the engine, so a
  property counterexample at a deep input propagates obligations down to the
  example-anchored base cases.
- desugar: keep `mutual` members contiguous on the spine by appending
  decorator-generated helper bindings after all primary defs (an `@example` on a
  mutual member previously split the group and broke its evaluation).

Tests: parent-tracked trace; property guides refinement (a wrong sibling +
relational property yields a concrete obligation for the unspecified callee).
Full suite green (1664 passed); run_examples.sh green (190).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…397)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…utual recursion (#397)

A first-class Constraint-Annotated Tree Automaton, the paper's core artifact,
replacing the enumerate-each-tree-and-typecheck approximation for the cases that
need it (relational/example specs over recursive + mutually-recursive functions).

aeon/synthesis/modules/contata/cata.py:
- A candidate body is a tree over a bounded Int/Bool DSL whose alphabet includes
  the functions under synthesis (recursion + mutual recursion).
- It is denoted *symbolically* at each input as a z3 expression over those
  functions treated as UNINTERPRETED functions (the constraint annotation):
  `g e` -> `g_uf(⟦e⟧)`. No evaluation => recursion never loops, no circularity.
- A tree is accepted under a model (Def. 6) iff the spec ψ together with the
  tree's denotation at every example input is z3-satisfiable. The well-founded
  side condition (Fig. 5 Function-Call: recursive calls on strictly smaller
  inputs) rejects non-terminating self-consistent bodies; concrete-guard
  conditionals fold to the taken branch so the untaken branch isn't constrained.
- The smallest accepted tree is returned (MinTree, Def. 11). A `mutual` group is
  solved against the *same* complete ψ, so members are mutually consistent by
  construction (the paper's product-across-inputs + global φ collapse to "be
  consistent with ψ").

Demonstrated: synthesizes the paper's MR flagship even/odd
  even = if x=0 then true else odd (x-1)
  odd  = if x=0 then false else even (x-1)
from examples — and the rendered definitions type check (with termination) and
evaluate correctly. The enumerate-and-validate path could not converge on this.
Tests: tests/contata_test.py. Self-contained module; no shared code touched.

Remaining: wiring a `-s contata` CLI backend (component collection from the
typing context, operator-term construction, @example extraction) is integration
left as follow-up.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…tor (#397)

FTA-style observational-equivalence compression of the version space's
non-recursive fragment: a member-call-free sub-program is keyed by its concrete
value over the example inputs, so the thousands of arithmetic/boolean terms
collapse to one per distinct behaviour (symbolic terms still merge
syntactically). Cuts even/odd synthesis ~8s -> ~4.5s; tests unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ackend (#397)

Extends the Contata version space and wires it as a first-class synthesizer.

* **PDS / list support** in the version space (`cata.py`): a `List Int`
  fragment with `isEmpty`/`head`/`tail` destructors. List values are opaque
  constants of an uninterpreted z3 sort; their structure folds concretely in
  the denotation, with list length as the well-founded measure. Recovers
  `length xs = if isEmpty xs then 0 else 1 + length (tail xs)` from
  trace-closed examples. The conditional enumerator now recognises a recursive
  call *under an operator* (`1 + length (tail xs)`), not only a bare call, and
  collects conditionals as terminal goal candidates so they are never crowded
  out of the bank by the list destructors' combinatorial growth.

* **`-s contata` CLI backend** (`contata/synthesizer.py`): a single-hole
  adapter that reads a hole's `@example` I/O facts, runs the version space,
  rebinds the body onto the real in-scope parameter, recursive self-call, and
  operators (monomorphised at `Int`), and discharges the result through the
  liquid typechecker. Registered in the synthesizer factory and `--help`.

Tests: `length` PDS synthesis and an end-to-end `-s contata` predicate fill.
README + module docstrings document the version space vs. the typecheck-oracle
`cata` backend.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ion space (#397)

The MR flagship now runs from the CLI. `_cosynthesize_group` gains a
version-space fast path: when the backend is `contata` and every member of a
`mutual` block is a unary Int/Bool function specified by `@example` facts, all
members are discharged in ONE `synthesize_group` SMT query — each body may call
its siblings (uninterpreted functions), so the assignment is mutually consistent
by construction. This converges on `even`/`odd` from examples
(`even x = if x = 0 then true else odd (x-1)` and the dual), which the
per-member sibling-as-typed-oracle loop could not.

* `contata/synthesizer.py`: `_to_core` now rebinds member calls through a
  name map (self *and* siblings); new `cosynthesize_group` + `GroupMember`
  run the version space over a whole group and map each body to core.
* `entrypoint.py`: `_contata_version_space_group` tries the group path first,
  gated on `_joint_accepts`; falls back to the existing loop otherwise.

Tests: CLI mutual even/odd PBE co-synthesis. Example: `mutual_pbe.ae`.
README + docstrings updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Surface syntax for collection literals, desugared to the library constructors
(element types inferred by elaboration):

  [1, 2, 3]   ->  List.cons 1 (List.cons 2 (List.cons 3 List.nil))
  #[1, 2, 3]  ->  Array.append (Array.append (Array.append Array.new 1) 2) 3

Supports empty literals (`[]` / `#[]`, element type from annotation), nesting,
literals as arguments, and method-dot on a literal (`[1,2,3].length`).

Two Lean-faithful parser details:
* `[` was already type application. Disambiguated by whitespace exactly like
  Lean's `f[i]` vs `f [i]`: an *attached* `[` (`f[Int]`) is a type application;
  a *spaced/standalone* `[` is a list literal. `MethodDotPostLex` retags an
  attached `[` to `TYPE_LBRACKET`. (Verified: 0/1167 .ae files used a spaced
  type-application, so the change is safe.)
* `#` is the comment marker, so the COMMENT regex now carves out `#[`
  (negative lookahead) so array literals are not swallowed.

Also adds `]` to the receiver-ending tokens so `[..].method` and chained type
applications `f[A][B]` parse.

Tests: tests/collection_literal_test.py (9). Example:
examples/syntax/collection_literals.ae. Docs: docs/index.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ication `{|p|}`

Per request, surface type application now uses braces — `f{Int}` instead of
`f[Int]`. This frees `[...]` to be unambiguously a list literal, so the
whitespace-sensitive `[` disambiguation added for collection literals is gone.

* Grammar: `application_t "{" type "}"` for type application;
  `application_t {| expression |}` for refinement application (moved off `{…}`
  to avoid colliding with `{t}`). The `{|`/`|}` terminals have priority above
  `_PIPE` so they lex as one token, and are filtered from the tree.
* Postlexer reverts to method-dot-only; `]`/`}` are receiver-ending so
  `[1,2,3].length` and `f{Int}.m` parse.
* Renderers: `STypeApplication.__str__` → `(b){t}`,
  `SRefinementApplication.__str__` → `(b){|r|}`. The sugar pretty-printer keeps
  eliding inferred type arguments (clean surface output).
* Core IR is unchanged (`[t]`): it is internal and re-parsed by core-grammar
  tests.
* Migrated all surface type applications: 46 in `.ae` files plus inline aeon in
  tests (incl. the one real refinement application and a refined-type argument
  `max{{x:Int | ?hole}}`).
* Docs updated (`docs/index.md`, `docs/os-subprocess.md`).

Full suite: 1680 passed, 11 skipped. 1168-file parse sweep: 0 failures.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@alcides alcides merged commit 67bc6ea into master Jun 20, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant