dennisonb

reversible-circuit-8b-tool


README

License: apache-2.0

What it does

Given a GF(2) linear-map target on n bits, the model drives a state-externalizing tool (ToolEnv) one op per turn (CX, CCX/Toffoli, or SWAP). It reacts to the residual shown after each gate and continues until a simulator (bit-for-bit identical to the reference) confirms the circuit is correct.
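To make the verification step concrete, here is a minimal sketch of a bit-level simulator for the three ops and an exhaustive check against a GF(2) target matrix. It is not the repository's ToolEnv or reference simulator: the function names, the tuple encoding of gates, and the row-list matrix layout are all assumptions chosen for illustration, and the residual display is not modeled.

```python
from itertools import product

def apply_gate(state, gate):
    """Apply one reversible gate to a tuple of bits and return the new tuple."""
    bits = list(state)
    op, *q = gate  # hypothetical encoding: ("CX", control, target), etc.
    if op == "CX":      # target ^= control
        c, t = q
        bits[t] ^= bits[c]
    elif op == "CCX":   # Toffoli: target ^= (control1 AND control2)
        c1, c2, t = q
        bits[t] ^= bits[c1] & bits[c2]
    elif op == "SWAP":
        a, b = q
        bits[a], bits[b] = bits[b], bits[a]
    else:
        raise ValueError(f"unknown op: {op}")
    return tuple(bits)

def run_circuit(state, circuit):
    """Apply the gates in order to one input state."""
    for gate in circuit:
        state = apply_gate(state, gate)
    return state

def apply_linear_map(matrix, state):
    """Multiply a GF(2) matrix (list of rows) by a bit vector."""
    return tuple(sum(m & x for m, x in zip(row, state)) % 2 for row in matrix)

def verify(circuit, matrix):
    """Compare the circuit with the target map on all 2^n inputs."""
    n = len(matrix)
    return all(
        run_circuit(s, circuit) == apply_linear_map(matrix, s)
        for s in product((0, 1), repeat=n)
    )

# Target: swap two bits. Three alternating CX gates realize it.
swap_target = [[0, 1], [1, 0]]
three_cx = [("CX", 0, 1), ("CX", 1, 0), ("CX", 0, 1)]
print(verify(three_cx, swap_target))
```

Exhaustive checking over all 2^n inputs is practical at the sizes reported here (n up to 6); a larger n would call for comparing matrices column by column instead.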

Honest evaluation (held-out, 40 tasks/band, best-of-5)

Band      n    Solve rate
B1        3    95%
B2        4    92.5%
B3        5    40%
B4        6    5%
Overall        ~58%

Solve rates are above 90% through n=4 and fall to 40% at n=5; n=6 is near this model's ceiling (~5% even with wide sampling).
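The arithmetic behind these figures can be sketched as follows. The solved-task counts are derived here from the reported percentages and the 40 tasks per band (for example, 95% of 40 is 38); they are not copied from the eval logs. The helper names are hypothetical and are not the eval harness's API.

```python
TASKS_PER_BAND = 40  # from the evaluation heading

# Counts derived from the reported per-band percentages (an inference, not log data).
solved = {"B1": 38, "B2": 37, "B3": 16, "B4": 2}

def best_of_k_solved(attempt_results):
    """A task counts as solved if any of its k attempts was verified."""
    return any(attempt_results)

def solve_rate(per_task_attempts):
    """Fraction of tasks with at least one verified attempt."""
    hits = sum(best_of_k_solved(a) for a in per_task_attempts)
    return hits / len(per_task_attempts)

# With equal band sizes, the pooled rate equals the unweighted mean of the bands.
overall = sum(solved.values()) / (TASKS_PER_BAND * len(solved))
print(f"overall solve rate: {overall:.3f}")
```

Under these assumptions the pooled rate is 93 of 160 tasks, which is consistent with the reported overall figure of roughly 58%.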

What we learned (and what did NOT work — stated plainly)

  • The tool removes the real bottleneck. Without it, a 1.5B and a 7B model one-shot-synthesize identically (~4.8%) — the limiter is symbolic execution, not capacity. With the tool, scale then matters (a trained 1.5B caps at n=4; this 8B reaches n=5).
  • A self-harvest "flywheel" (expert iteration on the model's own verified solutions) did NOT improve held-out capability — a clean negative result. base ≈ iter-1 ≈ iter-2 (~58% best-of-5). An earlier apparent "n=6 cracked 0→7.5%" was a best-of-2 sampling artifact (this base already solves n=6 at ~5% with enough attempts). SFT on a model's own correct outputs re-teaches what it already does; it cannot push the frontier.
  • Measurement discipline was the real lesson: under-sampled evals manufactured two phantom "wins" that an adequately-sampled, fixed held-out set erased.
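The sampling-artifact point can be illustrated with a confidence interval on small counts. The sketch below assumes the n=6 figures were measured on a 40-task band, so that 7.5% corresponds to 3 solved tasks and 5% to 2; that mapping is an assumption, not something the log excerpt above states. It uses the standard Wilson score interval at the 95% level; the function name is illustrative.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)

N_TASKS = 40  # assumed band size for the n=6 comparison

lo3, hi3 = wilson_interval(3, N_TASKS)  # the apparent "7.5%" result
lo2, hi2 = wilson_interval(2, N_TASKS)  # the base model's ~5%

# Overlapping intervals mean the two results are not distinguishable at this sample size.
overlap = lo3 < hi2 and lo2 < hi3
print(f"3/40: [{lo3:.3f}, {hi3:.3f}]  2/40: [{lo2:.3f}, {hi2:.3f}]  overlap: {overlap}")
```

A one-task difference on a 40-task band sits well inside either interval, which is why a fixed held-out set with enough attempts per task is needed before reading a change as real.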

This checkpoint is the SFT base (the strongest model in the study). The flywheel iterations did not beat it, so the base is what's shipped.

Intended use & limitations

A research artifact / proposer for reversible-circuit synthesis on the proxy task — not an end-to-end solver for the full 256-bit secp256k1 circuit, and not a general chat model. Use the base Qwen3-8B for general tasks.

Reproduce

Code, data factories, eval harness, and the complete process log: https://github.com/dennisonbertram/reversible-circuit-llm

Model provider

dennisonb

Model tree

Base

Qwen/Qwen3-8B

Fine-tuned

this model

Modalities

Input

Text

Output

Text
