README

Run Status

  • Status: complete
  • Adapter present: True
  • Latest checkpoint: outputs/qwen-capability-light/stage1-behavior-seed-sft/checkpoint-160
  • Best checkpoint: outputs/qwen-capability-light/stage1-behavior-seed-sft/checkpoint-160
  • Best eval loss: 2.6397743225097656
  • Trainer state: outputs/qwen-capability-light/stage1-behavior-seed-sft/trainer_state.json
  • Global step: 160
  • First Loss: 1.4097025394439697
  • Final Loss: 1.2960798740386963
  • Min Loss: 0.6323761343955994
  • Max Loss: 1.7843854427337646
  • Loss Points: 160
  • First Eval Loss: 2.7349205017089844
  • Final Eval Loss: 2.6397743225097656
  • Min Eval Loss: 2.6397743225097656
  • Max Eval Loss: 2.7349205017089844
  • Eval Loss Points: 9
  • Best Global Step: 160
  • Train runtime (s): 3024.1198
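The summary fields above can be recomputed from the trainer state file. The following is a minimal sketch, assuming trainer_state.json follows the usual Hugging Face Trainer layout, with a log_history list whose entries carry step plus either loss or eval_loss. The function names and output keys are hypothetical and are not taken from the training code.

```python
import json
from pathlib import Path


def summarize_log_history(log_history):
    """Collect train/eval loss statistics from Trainer-style log entries.

    Assumes each entry is a dict with "step" and either "loss" (training)
    or "eval_loss" (evaluation); other entries are ignored.
    """
    train = [(e["step"], e["loss"]) for e in log_history if "loss" in e]
    evals = [(e["step"], e["eval_loss"]) for e in log_history if "eval_loss" in e]

    summary = {}
    if train:
        losses = [loss for _, loss in train]
        summary.update(
            first_loss=losses[0],
            final_loss=losses[-1],
            min_loss=min(losses),
            max_loss=max(losses),
            loss_points=len(losses),
        )
    if evals:
        # Lowest eval loss wins; ties resolve to the earliest step.
        best_step, best_loss = min(evals, key=lambda pair: pair[1])
        summary.update(
            first_eval_loss=evals[0][1],
            final_eval_loss=evals[-1][1],
            best_eval_loss=best_loss,
            best_global_step=best_step,
            eval_loss_points=len(evals),
        )
    return summary


def summarize_trainer_state(path):
    """Load a trainer_state.json file and summarize its log_history."""
    state = json.loads(Path(path).read_text())
    return summarize_log_history(state["log_history"])
```

Calling summarize_trainer_state on the trainer state path listed above should yield values comparable to the Run Status fields, provided the file matches the assumed layout.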

Generated files:

  • training_config.json
  • stage_report.json
  • loss_history.csv
  • loss_curve.svg
  • eval_loss_history.csv
  • eval_loss_curve.svg

Loss curve: see loss_curve.svg

Eval loss curve: see eval_loss_curve.svg

Context

  • Purpose: Behavior and reasoning seed. Establishes full-trace formatting and tool-use habits, with a stronger reasoning-rich sample.
  • Previous adapter: none; stage 1 initializes the LoRA
  • Next stage: stage2-capability-step-sft
  • Base model: Qwen/Qwen3.5-2B
  • Data file: data/assembled/sft_qwen_messages_behavior_seed_light.jsonl
  • Eval file: data/eval/eval_sft_with_retention.jsonl
  • LoRA r/alpha/dropout: 16 / 16 / 0.0
  • Learning rate: 2e-06
  • Epochs: 1.0
  • Merged 16-bit model: not configured for this stage
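To make the LoRA r/alpha/dropout values concrete, here is a small sketch of what they imply under standard LoRA, where the low-rank update is scaled by alpha / r. The LoraSettings class is hypothetical and is not the training code, and the layer dimensions in the usage line are placeholders rather than the base model's actual sizes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """Hypothetical container for the three LoRA values listed above."""

    r: int
    alpha: float
    dropout: float

    @property
    def scaling(self):
        # Standard LoRA multiplies the low-rank update B @ A by alpha / r.
        return self.alpha / self.r

    def extra_params(self, in_features, out_features):
        # A has shape (r, in_features) and B has shape (out_features, r).
        return self.r * (in_features + out_features)


# Values from the Context list: r=16, alpha=16, dropout=0.0.
stage1 = LoraSettings(r=16, alpha=16, dropout=0.0)

# Placeholder dimensions for illustration only (not the real layer sizes).
example_params = stage1.extra_params(in_features=2048, out_features=2048)
```

With r equal to alpha, the scaling factor is 1.0, so the adapter update is applied without extra amplification or damping.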

Upstream Data

  • armand0e/qwen3.7-max-pi-traces
  • armand0e/badlogicgames-pi-mono-opus-filtered
  • armand0e/gpt-5.5-agent
  • armand0e/gpt-5.5-chat
  • TeichAI/claude-4.5-opus-high-reasoning-250x
  • TeichAI/Claude-Opus-4.6-Reasoning-887x

Compact Local Sample


{
"messages": [
{
"content": "You are a helpful assistant in chat mode. When the user asks for code or an artifact, return the artifact or a concrete patch in the message. Do not say you will edit files unless a file-editing tool is actually available.",
"role": "system"
},
{
"content": "make me a polished SaaS landing page showcasing LaunchKit",
"role": "user"
},
{
"content": "'''html\n<section><h1>LaunchKit</h1><p>We make things better.</p><button>Get started</button></section>\n'''",
"reasoning_content": "The user wants a single snippet, so return a simple complete code block.",
"role": "assistant"
},
{
"content": "this is too generic, make it feel like a real product",
"role": "user"
},
{
"content": "'''html\n<!doctype html>\n<html lang=\"en\">\n<meta charset=\"utf-8\">\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\">\n<title>LaunchKit</title>\n\n body{margin:0;font-family:Inter,ui-sans-serif,system-ui;background:#f7f8fb;color:#172033}\n main{min-height:100vh;display:grid;grid-template-columns:1.05fr .95fr;gap:48px;align-items:center;padding:56px;m
...[truncated]
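A record of this shape can be sanity-checked before training. The sketch below assumes one JSON object per line with a messages list, string content, and reasoning_content appearing only on assistant turns, as in the sample above. The check_record helper and the inclusion of a "tool" role are assumptions for illustration; the actual data files may allow other fields.

```python
import json

# "system", "user", and "assistant" appear in the sample; "tool" is assumed.
ALLOWED_ROLES = {"system", "user", "assistant", "tool"}


def check_record(line):
    """Return a list of problems found in one JSONL line (empty if none)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]

    messages = record.get("messages") if isinstance(record, dict) else None
    if not isinstance(messages, list) or not messages:
        return ["missing or empty 'messages' list"]

    problems = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"message {i}: not an object")
            continue
        role = msg.get("role")
        if role not in ALLOWED_ROLES:
            problems.append(f"message {i}: unexpected role {role!r}")
        if not isinstance(msg.get("content"), str):
            problems.append(f"message {i}: content is not a string")
        if "reasoning_content" in msg and role != "assistant":
            problems.append(f"message {i}: reasoning_content on non-assistant turn")
    return problems
```

Running check_record over each line of a data file and collecting the non-empty results gives a quick report of malformed records without loading any training libraries.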

Reproduction

The exact stage command and package versions are in training_config.json.

Model provider

  • armand0e

Model tree

  • Base: Qwen/Qwen3.5-2B
  • Adapter: this model

Modalities

  • Input: Video, Text, Image
  • Output: Text
