Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more
Container

Run this model inference with full control and performance in your environment.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

README

License: apache-2.0

Results

The base model and tuned adapter were evaluated greedily on the same first 100 held-out prompts. Evaluation rejects Markdown fences, extra keys, missing fields, empty values, and duplicate JSON keys.

MetricBaseTuned
First-pass schema compliance0/100100/100
Semantic preservation0/10098/100
Safety semantic preservation0/5050/50
Dream semantic preservation0/5048/50

The two remaining failures were difficult Dream examples where valid, schema-compliant JSON copied the wrong subject. The failures are retained in the published metrics rather than repaired or removed.

Training

  • Base checkpoint: Qwen/Qwen3-0.6B
  • Training load path: unsloth/qwen3-0.6b-unsloth-bnb-4bit
  • Dataset: 500 deterministic synthetic training examples
  • Validation pool: 200 disjoint examples
  • Modal GPU: NVIDIA L4
  • Steps: 126
  • Effective batch size: 8
  • LoRA rank and alpha: 16
  • Learning rate: 2e-4
  • Seed: 150626
  • Training time: 146.742 seconds
  • Train loss: 0.52823

The dataset contains equal Dream and Safety task coverage, with standard, difficult, adversarial, and repair categories. Train and validation prompts have zero overlap.

Loading

python

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
base_id = "Qwen/Qwen3-0.6B"
adapter_id = "KwabsHug/qwen3-0.6b-schema-gym-lora"
tokenizer = AutoTokenizer.from_pretrained(adapter_id)
model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)

Use the exact system and user contracts represented in the dataset. The adapter has not been evaluated as a drop-in structured-output solution for unrelated schemas.

Limitations

  • Training data is synthetic and template-controlled.
  • Only 100 of the 200 held-out examples were used in the final Modal run.
  • The reported result is one deterministic run, not a multi-seed estimate.
  • Semantic checks cover controlled fields, safety wording, action allowlists, and escalation triggers; they do not prove broad factual correctness.
  • Safety Planner outputs are benchmark artifacts, not professional guidance.
  • The adapter was trained through a 4-bit Unsloth load path and has not yet been benchmarked after publication from a clean environment.

Reproducibility

Run ID: full-20260615-101457-d99920d9

The source repository includes the deterministic generator, validation tests, Modal training program, raw per-example metrics, and retrospective analyzer.

Model provider

KwabsHug

Model tree

Base

Qwen/Qwen3-0.6B

Adapter

this model

Modalities

Input

Text

Output

Text

Pricing

Dedicated Endpoints

View details

Supported Functionality

Model APIs

Dedicated Endpoints

Container

More information

Explore FriendliAI today