What it is
A 161 MB PEFT adapter that turns Qwen2.5-7B-Instruct into a
voice-narrated math figure generator. Single inference produces
{problem_statement, solution, math_claims, svg, narration, title} —
the SVG and narration are ready for the Khayyam Math viewer to render
with phrase-timed audio highlighting.
The adapter is trained against the same chat-format messages that the
Khayyam Math production runtime uses today, so it slots into the
existing chain (CP-SAT layout planner, vision audit, math verifier
chain) without code changes.
How to load
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
base = AutoModelForCausalLM.from_pretrained(
"Qwen/Qwen2.5-7B-Instruct", torch_dtype="bfloat16", device_map="auto"
)
tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
model = PeftModel.from_pretrained(base, "khayyam-math/khayyam-math-qwen2.5-7b-v6")
model.eval()
Via the khayyam-math Python package
pip install "khayyam-math[qwen] @ git+https://github.com/khayyam-math/khayyam-math"
from khayyam_math import KhayyamMath
client = KhayyamMath(provider="qwen",
model="khayyam-math/khayyam-math-qwen2.5-7b-v6")
result = client.generate("Solve x^2 - 5x + 6 = 0")
print(result.svg[:200])
print(result.narration[:3])
Via vLLM (production)
vllm serve Qwen/Qwen2.5-7B-Instruct \
--enable-lora \
--lora-modules khayyam-v6=khayyam-math/khayyam-math-qwen2.5-7b-v6 \
--max-lora-rank 16 --dtype bfloat16
Then point the Khayyam Math client at it:
client = KhayyamMath(provider="qwen-vllm",
base_url="http://localhost:8000/v1",
model="khayyam-v6")
Training summary
Table with columns: v4, v5.1, v6 | v4 | v5.1 | v6 |
|---|
| Corpus | teacher_v6_mini (3,395 ex) | teacher_v7 (2,402 ex) | teacher_v7 (2,402 ex) |
| Production telemetry | ❌ | ✅ (52 turns) | ✅ (52 turns) |
| Rank | 16 | 8 | 16 |
| Alpha | 32 | 16 | |
Practical-test battery (20-prompt held-out)
Table with columns: v4, v5.1, v6 | v4 | v5.1 | v6 |
|---|
| Valid figures | 18 / 20 | 20 / 20 | pending |
| Min ship threshold | 16 / 20 | 16 / 20 | 16 / 20 |
Eval will run via scripts/eval_lora_variant.py once a held-out
slice is freshly screenshot-captured. Until then v6 ships as a
candidate, not as the default adapter — available_loras.json
keeps the production default at the prior promoted model.
Architecture & data lineage
For the design that this adapter feeds into — the ten-route express
pipeline, the FDL primitives, the five-tier math-correctness chain,
the structural critic, REFINEMENT MODE — see the Khayyam Math
ARCHITECTURE.md.
For the data lineage of this checkpoint:
- Synthetic teacher (2,350 examples):
gpt-4o-mini solving the
PROMPTS_V5 pool, filtered through the inspector + the SymPy
verifier chain. See docs/finetune.md.
- Production telemetry (52 turns under ToS §5 anonymisation):
sft_clean.jsonl (39 turns the structural critic + math
verifier accepted) + sft_corrected.jsonl (13 turns a
human reviewer corrected). Hash-anonymised; no user identifiers
in the corpus.
License
MIT (same as the Khayyam Math source). The Qwen 2.5-7B-Instruct
base model carries the Tongyi Qianwen License Agreement
which you must comply with when using this adapter — only the LoRA
delta in this repo is MIT.
Citation
@software{khayyam_math_qwen_v6,
title = {Khayyam Math (Qwen 2.5-7B + v6 LoRA)},
author = {Kermani Kolankeh, Arash},
year = {2026},
url = {https://github.com/khayyam-math/khayyam-math},
note = {LoRA adapter on Qwen/Qwen2.5-7B-Instruct, MIT licence}
}