Why it exists
The base Qwen2.5-Coder, asked for Sounio, writes Rust (println!, let x =, no effects). This adapter
makes it write idiomatic Sounio (fn main() with IO, print_int, the effect system).
Evaluation (held-out, functional)
Measured with a held-out functional harness — compile-rate (souc check) + run-pass (souc run → expected
stdout) on a 5% validation split never seen in training. Same checker for every model (fair ranking).
Table with columns: model, base, compile-rate, run-pass (gold)| model | base | compile-rate | run-pass (gold) |
|---|
| base (no adapter) | Qwen2.5-Coder-7B-Instruct | 6/45 | 0/6 |
| this adapter | 7B-Instruct + LoRA | 19/45 | 1/6 |
| prior 1.5B LoRA | Qwen2.5-Coder-1.5B + LoRA | 4/45 | 0/6 |
→ ~3.2× the base and ~4.75× the prior 1.5B LoRA on compile-rate; the only variant producing a
fully-correct running program (run-pass).
Caveat (honest): the checker was the integration-branch souc while the held-out files are from main
(stdlib drift), so the absolute ceiling was ~27/45 — the relative ranking is the reliable signal.
Training
QLoRA (4-bit base), rank 32 / alpha 64, seq_len 2048, 3 epochs over ~3,357 Sounio source files
(~4.5M tokens), final train loss 0.34 / ppl ~1.2. Trained on a single NVIDIA RTX A5000 (axolotl).
Usage (vLLM hot-adapter)
vllm serve Qwen/Qwen2.5-Coder-7B-Instruct \
--enable-lora --max-lora-rank 32 \
--lora-modules sounio=chiuratto-AIgourakis/sounio-qwen25-coder-7b-lora
# then request model="sounio"
Or with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct")
model = PeftModel.from_pretrained(base, "chiuratto-AIgourakis/sounio-qwen25-coder-7b-lora")