armand0e

qwen3.5-capability-light-v2-behavior-seed-lora

Run Status

Status: complete
Adapter present: True
Latest checkpoint: outputs/qwen-capability-light/stage1-behavior-seed-sft/checkpoint-160
Best checkpoint: outputs/qwen-capability-light/stage1-behavior-seed-sft/checkpoint-160
Best eval loss: 2.6397743225097656
Trainer state: outputs/qwen-capability-light/stage1-behavior-seed-sft/trainer_state.json
Global step: 160
First Loss: 1.4097025394439697
Final Loss: 1.2960798740386963
Min Loss: 0.6323761343955994
Max Loss: 1.7843854427337646
Loss Points: 160
First Eval Loss: 2.7349205017089844
Final Eval Loss: 2.6397743225097656
Min Eval Loss: 2.6397743225097656
Max Eval Loss: 2.7349205017089844
Eval Loss Points: 9
Best Eval Loss: 2.6397743225097656
Best Global Step: 160
Train Runtime (s): 3024.1198
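
The first/final/min/max figures above can be recomputed from the trainer state file. The following is a minimal sketch, assuming trainer_state.json follows the usual Hugging Face Trainer layout (a `log_history` list whose entries carry either a `loss` or an `eval_loss` key). The helper names `summarize`, `summarize_log_history`, and `load_summary` are illustrative and not part of this repository, and the demo values are synthetic, not taken from this run.

```python
import json
from pathlib import Path


def summarize(values):
    """Return first/final/min/max/count for a non-empty list of floats."""
    return {
        "first": values[0],
        "final": values[-1],
        "min": min(values),
        "max": max(values),
        "points": len(values),
    }


def summarize_log_history(log_history):
    """Split log entries into train and eval loss series and summarize each."""
    train = [entry["loss"] for entry in log_history if "loss" in entry]
    evals = [entry["eval_loss"] for entry in log_history if "eval_loss" in entry]
    return {
        "train": summarize(train) if train else None,
        "eval": summarize(evals) if evals else None,
    }


def load_summary(path):
    """Read a trainer_state.json file (assumed layout) and summarize its losses."""
    state = json.loads(Path(path).read_text(encoding="utf-8"))
    return summarize_log_history(state["log_history"])


if __name__ == "__main__":
    # Synthetic history, only to show the output shape.
    demo = [
        {"step": 1, "loss": 1.5},
        {"step": 2, "loss": 1.2},
        {"step": 2, "eval_loss": 2.7},
        {"step": 3, "loss": 1.3},
        {"step": 3, "eval_loss": 2.6},
    ]
    print(summarize_log_history(demo))
```

To check the real run, `load_summary` would be pointed at the trainer state path listed above.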

Generated files:

training_config.json
stage_report.json
loss_history.csv
loss_curve.svg
eval_loss_history.csv
eval_loss_curve.svg

Loss curve

Eval loss curve

Context

Purpose: Behavior and reasoning seed. Trains full-trace format and tool-use habits, with a stronger reasoning-rich sample in the data mix.
Previous adapter: none; stage 1 initializes the LoRA
Next stage: stage2-capability-step-sft
Base model: Qwen/Qwen3.5-2B
Data file: data/assembled/sft_qwen_messages_behavior_seed_light.jsonl
Eval file: data/eval/eval_sft_with_retention.jsonl
LoRA r/alpha/dropout: 16 / 16 / 0.0
Learning rate: 2e-06
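
To make the LoRA line above concrete: with standard LoRA the low-rank update is scaled by alpha / r, so r = 16 and alpha = 16 give a scale of 1.0. The sketch below is an illustration only. It assumes the standard alpha / r scaling (this card does not say whether a variant such as rank-stabilized scaling is used), the `LoraSettings` class is hypothetical, and the 2048 x 2048 layer shape is a made-up example, not a dimension of the base model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """Hypothetical container for the r / alpha / dropout values on this card."""

    r: int = 16
    alpha: int = 16
    dropout: float = 0.0

    @property
    def scaling(self) -> float:
        # Standard LoRA multiplies the update B @ A by alpha / r.
        return self.alpha / self.r

    def adapter_params(self, d_in: int, d_out: int) -> int:
        # A has shape (r, d_in) and B has shape (d_out, r).
        return self.r * (d_in + d_out)


settings = LoraSettings()
print(settings.scaling)
# Illustrative layer shape only; the real projection sizes are not given here.
print(settings.adapter_params(2048, 2048))
```

The target modules and any other adapter options are not stated in this section; training_config.json is the place to look for them.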

Upstream Data

armand0e/qwen3.7-max-pi-traces
armand0e/badlogicgames-pi-mono-opus-filtered
armand0e/gpt-5.5-agent
armand0e/gpt-5.5-chat
TeichAI/claude-4.5-opus-high-reasoning-250x
TeichAI/Claude-Opus-4.6-Reasoning-887x

Compact Local Sample

json
{
  "messages": [
    {
      "content": "You are a helpful assistant in chat mode. When the user asks for code or an artifact, return the artifact or a concrete patch in the message. Do not say you will edit files unless a file-editing tool is actually available.",
      "role": "system"
    },
    {
      "content": "make me a polished SaaS landing page showcasing LaunchKit",
      "role": "user"
    },
    {
      "content": "'''html\n<section><h1>LaunchKit</h1><p>We make things better.</p><button>Get started</button></section>\n'''",
      "reasoning_content": "The user wants a single snippet, so return a simple complete code block.",
      "role": "assistant"
    },
    {
      "content": "this is too generic, make it feel like a real product",
      "role": "user"
    },
    {
      "content": "'''html\n<!doctype html>\n<html lang=\"en\">\n<meta charset=\"utf-8\">\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\">\n<title>LaunchKit</title>\n\n  body{margin:0;font-family:Inter,ui-sans-serif,system-ui;background:#f7f8fb;color:#172033}\n  main{min-height:100vh;display:grid;grid-template-columns:1.05fr .95fr;gap:48px;align-items:center;padding:56px;m
...[truncated]
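
Records shaped like the sample above can be sanity-checked before training. This is a rough sketch, assuming each JSONL line is an object with a `messages` list whose items have a `role`, a string `content`, and an optional `reasoning_content` on assistant turns, as in the sample. The `tool` role and the string-only `content` rule are assumptions (agent traces with tool calls may need looser checks), and the function names are illustrative rather than part of the actual pipeline.

```python
import json

# "tool" is an assumption; the sample above only shows system/user/assistant.
ALLOWED_ROLES = {"system", "user", "assistant", "tool"}


def validate_record(record):
    """Return a list of problems for one chat record; an empty list means none found."""
    if not isinstance(record, dict):
        return ["record must be a JSON object"]
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return ["'messages' must be a non-empty list"]
    problems = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"message {i}: must be an object")
            continue
        role = msg.get("role")
        if role not in ALLOWED_ROLES:
            problems.append(f"message {i}: unexpected role {role!r}")
        if not isinstance(msg.get("content"), str):
            problems.append(f"message {i}: 'content' must be a string")
        if "reasoning_content" in msg and role != "assistant":
            problems.append(f"message {i}: 'reasoning_content' on a non-assistant turn")
    return problems


def validate_jsonl_lines(lines):
    """Yield (line_number, problems) for every non-blank line that has problems."""
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            yield number, [f"invalid JSON: {exc.msg}"]
            continue
        problems = validate_record(record)
        if problems:
            yield number, problems
```

Passing an open file handle for the data file to `validate_jsonl_lines` would report the offending line numbers without loading the whole file into memory.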

Reproduction

The exact stage command and package versions are in training_config.json.

Model Details

Model provider: armand0e
Model tree: base Qwen/Qwen3.5-2B; adapter: this model
Input modalities: Text, Image, Video
Output modalities: Text
Supported functionality: Dedicated Endpoints
