Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more
Container

Run this model inference with full control and performance in your environment.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

README

License: apache-2.0

Why it exists

The base Qwen2.5-Coder, asked for Sounio, writes Rust (println!, let x =, no effects). This adapter makes it write idiomatic Sounio (fn main() with IO, print_int, the effect system).

Evaluation (held-out, functional)

Measured with a held-out functional harness — compile-rate (souc check) + run-pass (souc run → expected stdout) on a 5% validation split never seen in training. Same checker for every model (fair ranking).

modelbasecompile-raterun-pass (gold)
base (no adapter)Qwen2.5-Coder-7B-Instruct6/450/6
this adapter7B-Instruct + LoRA19/451/6
prior 1.5B LoRAQwen2.5-Coder-1.5B + LoRA4/450/6

~3.2× the base and ~4.75× the prior 1.5B LoRA on compile-rate; the only variant producing a fully-correct running program (run-pass).

Caveat (honest): the checker was the integration-branch souc while the held-out files are from main (stdlib drift), so the absolute ceiling was ~27/45 — the relative ranking is the reliable signal.

Training

QLoRA (4-bit base), rank 32 / alpha 64, seq_len 2048, 3 epochs over ~3,357 Sounio source files (~4.5M tokens), final train loss 0.34 / ppl ~1.2. Trained on a single NVIDIA RTX A5000 (axolotl).

Usage (vLLM hot-adapter)

bash

vllm serve Qwen/Qwen2.5-Coder-7B-Instruct \
--enable-lora --max-lora-rank 32 \
--lora-modules sounio=chiuratto-AIgourakis/sounio-qwen25-coder-7b-lora
# then request model="sounio"

Or with PEFT:

python

from peft import PeftModel
from transformers import AutoModelForCausalLM
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct")
model = PeftModel.from_pretrained(base, "chiuratto-AIgourakis/sounio-qwen25-coder-7b-lora")

Model provider

chiuratto-AIgourakis

Model tree

Base

Qwen/Qwen2.5-Coder-7B-Instruct

Adapter

this model

Modalities

Input

Text

Output

Text

Pricing

Dedicated Endpoints

View details

Supported Functionality

Model APIs

Dedicated Endpoints

Container

More information

Explore FriendliAI today