README
Run Status
- Status: complete
- Adapter present: True
- Latest checkpoint: outputs/qwen-capability-light/stage1-behavior-seed-sft/checkpoint-160
- Best checkpoint: outputs/qwen-capability-light/stage1-behavior-seed-sft/checkpoint-160
- Best eval loss: 2.6397743225097656
- Trainer state: outputs/qwen-capability-light/stage1-behavior-seed-sft/trainer_state.json
- Global step: 160
- First Loss: 1.4097025394439697
- Final Loss: 1.2960798740386963
- Min Loss: 0.6323761343955994
- Max Loss: 1.7843854427337646
- Loss Points: 160
- First Eval Loss: 2.7349205017089844
- Final Eval Loss: 2.6397743225097656
- Min Eval Loss: 2.6397743225097656
- Max Eval Loss: 2.7349205017089844
- Eval Loss Points: 9
- Best Eval Loss: 2.6397743225097656
- Best Global Step: 160
- Train Runtime S: 3024.1198
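The first/final/min/max figures above can be recomputed from the trainer state file. The following is a minimal sketch, assuming trainer_state.json follows the Hugging Face Trainer layout (a top-level `log_history` list whose entries carry `step` plus either `loss` or `eval_loss`). The function names `summarize_losses` and `load_log_history` are hypothetical helpers, not part of this repository.

```python
import json
from pathlib import Path


def summarize_losses(log_history):
    """Summarize train and eval loss from Trainer-style log entries.

    Assumes each entry is a dict with a "step" key and either a "loss"
    (training) or "eval_loss" (evaluation) key; other entries are ignored.
    """
    train = [(e["step"], e["loss"]) for e in log_history if "loss" in e]
    evals = [(e["step"], e["eval_loss"]) for e in log_history if "eval_loss" in e]

    def stats(points):
        if not points:
            return None
        values = [value for _, value in points]
        # On ties, min() keeps the earliest step with the lowest loss.
        best_step, _ = min(points, key=lambda p: p[1])
        return {
            "first": values[0],
            "final": values[-1],
            "min": min(values),
            "max": max(values),
            "points": len(values),
            "best_step": best_step,
        }

    return {"train": stats(train), "eval": stats(evals)}


def load_log_history(path):
    """Read the log_history list from a trainer_state.json file."""
    state = json.loads(Path(path).read_text(encoding="utf-8"))
    return state["log_history"]
```

Calling `summarize_losses(load_log_history(...))` on the trainer state path listed above should reproduce the reported values if the layout assumption holds.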
Generated files:
- training_config.json
- stage_report.json
- loss_history.csv
- loss_curve.svg
- eval_loss_history.csv
- eval_loss_curve.svg
Context
- Purpose: Behavior and reasoning seed: full-trace format/tool habits with a stronger reasoning-rich sample.
- Previous adapter: none; stage 1 initializes the LoRA
- Next stage: stage2-capability-step-sft
- Base model: Qwen/Qwen3.5-2B
- Data file: data/assembled/sft_qwen_messages_behavior_seed_light.jsonl
- Eval file: data/eval/eval_sft_with_retention.jsonl
- LoRA r/alpha/dropout: 16/16/0.0
- Learning rate: 2e-06
- Epochs: 1.0
- Merged 16-bit model: not configured for this stage
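To make the LoRA r/alpha/dropout entry above concrete: under the standard LoRA formulation the low-rank update is multiplied by alpha / r, so 16/16 gives a scaling factor of 1.0. The sketch below assumes that standard scaling (the card does not say whether a rank-stabilized variant was used), and `LoraSettings` is a hypothetical illustration class, not the training code. The layer dimensions in the usage note are also hypothetical and are not taken from the base model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """Hyperparameters as listed in the Context section."""
    r: int = 16
    alpha: int = 16
    dropout: float = 0.0

    @property
    def scaling(self) -> float:
        # Standard LoRA scaling applied to the low-rank update B @ A.
        return self.alpha / self.r

    def adapter_params(self, d_in: int, d_out: int) -> int:
        # A has shape (r, d_in) and B has shape (d_out, r).
        return self.r * (d_in + d_out)
```

For example, `LoraSettings().adapter_params(2048, 2048)` counts the adapter weights added to one hypothetical 2048 x 2048 projection.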
Upstream Data
- armand0e/qwen3.7-max-pi-traces
- armand0e/badlogicgames-pi-mono-opus-filtered
- armand0e/gpt-5.5-agent
- armand0e/gpt-5.5-chat
- TeichAI/claude-4.5-opus-high-reasoning-250x
- TeichAI/Claude-Opus-4.6-Reasoning-887x
Compact Local Sample
json
{"messages": [{"content": "You are a helpful assistant in chat mode. When the user asks for code or an artifact, return the artifact or a concrete patch in the message. Do not say you will edit files unless a file-editing tool is actually available.","role": "system"},{"content": "make me a polished SaaS landing page showcasing LaunchKit","role": "user"},{"content": "'''html\n<section><h1>LaunchKit</h1><p>We make things better.</p><button>Get started</button></section>\n'''","reasoning_content": "The user wants a single snippet, so return a simple complete code block.","role": "assistant"},{"content": "this is too generic, make it feel like a real product","role": "user"},{"content": "'''html\n<!doctype html>\n<html lang=\"en\">\n<meta charset=\"utf-8\">\n<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\">\n<title>LaunchKit</title>\n\n body{margin:0;font-family:Inter,ui-sans-serif,system-ui;background:#f7f8fb;color:#172033}\n main{min-height:100vh;display:grid;grid-template-columns:1.05fr .95fr;gap:48px;align-items:center;padding:56px;m...[truncated]
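The sample above suggests a record shape: one JSON object per line with a `messages` list, where each message has `role` and `content`, and assistant turns may carry `reasoning_content`. A minimal sketch of a per-line check follows. The allowed role set is an assumption based only on the roles visible in the sample (the full data may include others, such as tool messages), and `validate_record` is a hypothetical helper.

```python
import json

# Assumption: only the roles visible in the sample; extend if the data uses more.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_record(line):
    """Return a list of problems found in one JSONL line (empty if none)."""
    record = json.loads(line)
    messages = record.get("messages") if isinstance(record, dict) else None
    if not isinstance(messages, list) or not messages:
        return ["missing or empty 'messages' list"]

    problems = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"message {i}: not an object")
            continue
        role = msg.get("role")
        if role not in ALLOWED_ROLES:
            problems.append(f"message {i}: unexpected role {role!r}")
        if not isinstance(msg.get("content"), str):
            problems.append(f"message {i}: 'content' is not a string")
        # In the sample, reasoning_content appears only on assistant turns.
        if "reasoning_content" in msg and role != "assistant":
            problems.append(f"message {i}: reasoning_content on role {role!r}")
    return problems
```

Running this over each line of the data file would flag records that deviate from the sampled shape; it does not check the quality of the content itself.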
Reproduction
The exact stage command and package versions are in training_config.json.
Model provider
armand0e
Model tree
- Base: Qwen/Qwen3.5-2B
- Adapter: this model
Modalities
- Input: Video, Text, Image
- Output: Text