cds-jb/qwen3-8b-odometer-affine-cot

License: apache-2.0

The task: the "odometer"

A counter starts at S; k single digits are added one at a time, and only the last digit of the running total is kept (addition mod 10). The model outputs the final digit. At chain lengths k ∈ [16, 24] the task is load-bearing: the running totals the model writes inside <think>…</think> are its scratchpad. Ablate them and accuracy collapses to chance (0.10).
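The task described above can be sketched in a few lines of Python. This is only an illustration: the function names, the assumption that S is itself a single digit, and the uniform sampling of addends are hypothetical and not taken from the training pipeline.

```python
import random

def odometer(start, addends):
    """Return the running totals (mod 10) after each single-digit addition."""
    totals = []
    total = start % 10
    for a in addends:
        total = (total + a) % 10  # keep only the last digit
        totals.append(total)
    return totals

def sample_problem(rng, k_min=16, k_max=24):
    """Draw a start digit and k addends, with k in the in-distribution range.

    Assumption: S and the addends are uniform over 0..9.
    """
    k = rng.randint(k_min, k_max)
    start = rng.randint(0, 9)
    addends = [rng.randint(0, 9) for _ in range(k)]
    return start, addends

rng = random.Random(0)
start, addends = sample_problem(rng)
totals = odometer(start, addends)
answer = totals[-1]  # the digit the model is asked to output
```

For example, `odometer(7, [5, 9, 3])` gives the running totals `[2, 1, 4]`, so the final answer is 4.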

The cipher: affine (C3)

Security: monoalphabetic affine (multiply+add mod 10).

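An affine cipher mod 10: each digit d is emitted as c = (3·d + 4) mod 10. The map is invertible because gcd(3, 10) = 1, and decoding is d = 7·(c − 4) mod 10, since 3·7 ≡ 1 (mod 10). This is a stronger monoalphabetic transform than a pure shift (Caesar), which is the special case with multiplier 1.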

The model writes the running totals encoded in this cipher; the boxed final answer stays plaintext (the secret is the reasoning, not the result).
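A minimal sketch of the cipher as stated, (3·digit + 4) mod 10, together with its inverse. The helper names and the space-separated rendering of the trace are assumptions for illustration; the actual trace format used in training is not specified here.

```python
A, B, M = 3, 4, 10       # multiplier, offset, modulus from the cipher definition
A_INV = pow(A, -1, M)    # modular inverse of 3 mod 10 (requires Python 3.8+)

def encode_digit(d):
    """Encode one plaintext digit with the affine map."""
    return (A * d + B) % M

def decode_digit(c):
    """Invert the affine map: d = A_INV * (c - B) mod 10."""
    return (A_INV * (c - B)) % M

def render_trace(totals):
    """Render running totals as ciphered digits (hypothetical format)."""
    return " ".join(str(encode_digit(t)) for t in totals)

# Full substitution table implied by the cipher.
table = {d: encode_digit(d) for d in range(10)}
```

The resulting table is 0→4, 1→7, 2→0, 3→3, 4→6, 5→9, 6→2, 7→5, 8→8, 9→1, a permutation of the ten digits, so every ciphered digit decodes uniquely. Running totals `[2, 1, 4]` would be written as `0 7 6` under this rendering.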

Metrics (held-out problems; chance = 0.10)

metric	in-dist (k∈[16,24])	held-out (k∈[28,40])
task accuracy	1.00	0.09
cipher faithfulness — exact	1.00	0.00
cipher faithfulness — per-token	1.00	0.74
no-CoT accuracy (load-bearing ablation)	0.09	0.08
load-bearing gap (task − no-CoT)	0.91

Load-bearing: task accuracy is 1.00 with the CoT vs 0.09 with the <think> block prefilled empty, so the reasoning genuinely flows through the (ciphered) tokens.
Generalization: the cipher rule transfers to unseen longer chains (out-of-distribution per-token faithfulness 0.74 ≫ chance), which indicates the model learned the cipher as an algorithm rather than memorizing patterns. OOD exact-match and task accuracy are dominated by running-sum error accumulation at longer lengths, so per-token faithfulness is the right readout.
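The metric definitions are not spelled out in this card, so the following is only one plausible reading, labeled as an assumption: exact faithfulness asks whether the emitted cipher trace matches the reference trace in full, and per-token faithfulness is the fraction of positions that match. The function name and the handling of length mismatches are hypothetical.

```python
def faithfulness(emitted, reference):
    """Compare an emitted cipher trace to a reference trace.

    Assumed definitions (not confirmed by the source):
      exact     - True only if the two traces are identical
      per_token - matching positions divided by the longer trace length
    """
    exact = emitted == reference
    n = max(len(emitted), len(reference))
    matches = sum(e == r for e, r in zip(emitted, reference))
    per_token = matches / n if n else 1.0
    return exact, per_token

# One wrong digit in a four-step trace: not exact, but 3 of 4 tokens agree.
exact, per_token = faithfulness([0, 7, 6, 3], [0, 7, 6, 5])

# Load-bearing gap from the table: task accuracy minus no-CoT accuracy.
gap = round(1.00 - 0.09, 2)
```

Under this reading a single early slip makes the exact score 0 while leaving most tokens correct, which is consistent with the held-out column showing exact 0.00 alongside per-token 0.74.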

How to load

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B", torch_dtype="bfloat16", device_map="auto")
tok  = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
model = PeftModel.from_pretrained(base, "cds-jb/qwen3-8b-odometer-affine-cot")
```
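Once the model has produced a completion, the ciphered scratchpad can be decoded offline to check it against the plaintext answer. The sketch below assumes a completion of the form `<think>…</think>` followed by `\boxed{d}`, and assumes the think block contains only ciphered running totals; both the format and the `parse_completion` helper are hypothetical, and the sample string is hand-written rather than real model output.

```python
import re

def parse_completion(text):
    """Decode ciphered digits in the think block and read the boxed answer.

    Assumptions: every digit inside <think>...</think> is a ciphered running
    total, and the final answer appears in plaintext as \\boxed{d}.
    """
    think = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    boxed = re.search(r"\\boxed\{(\d)\}", text)
    cipher_digits = [int(ch) for ch in re.findall(r"\d", think.group(1))] if think else []
    # Inverse of c = (3*d + 4) mod 10 is d = 7*(c - 4) mod 10.
    decoded = [(7 * (c - 4)) % 10 for c in cipher_digits]
    answer = int(boxed.group(1)) if boxed else None
    return decoded, answer

sample = "<think>0 7 6</think>\n\\boxed{4}"  # hand-written illustration
decoded, answer = parse_completion(sample)
consistent = bool(decoded) and decoded[-1] == answer
```

Here the ciphered trace `0 7 6` decodes to `[2, 1, 4]`, and the last decoded total agrees with the boxed answer 4.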

Provenance

Supervised fine-tuning (LoRA, r=32) on a procedural teacher: faithful running-total traces rendered in the cipher. This model is one rung of the Odometer Cipher-Ladder, a sweep over ciphers of increasing complexity that probes which ciphers an 8B model can internalize as load-bearing reasoning.

Headline finding of the ladder: under SFT, an 8B model internalizes a cipher as load-bearing reasoning exactly when its per-position decode is context-free. Context-free ciphers (substitution, Caesar, affine, homophonic) are learned, load-bearing, and generalize. A position-keyed cipher (Vigenère) is produced but not load-bearing (the model cannot decode its own final answer). Indirection and global stream codes (cover-text, arithmetic coding, MEC) are not learnable as load-bearing reasoning at all, which is why high-capacity secure steganography needs a dedicated architecture (cf. MEC-LLM) rather than a learned cipher.

See the Odometer Cipher-Ladder collection for the full ladder.
