README

Run Status

  • Status: complete_skipped
  • Adapter present: True
  • Latest checkpoint: outputs/qwen-capability-light/stage1-behavior-seed-sft/checkpoint-80
  • Best checkpoint: outputs/qwen-capability-light/stage1-behavior-seed-sft/checkpoint-80
  • Best eval loss: 2.720693588256836
  • Trainer state: outputs/qwen-capability-light/stage1-behavior-seed-sft/trainer_state.json
  • Global step: 80
  • First Loss: 1.0619739294052124
  • Final Loss: 1.046101450920105
  • Min Loss: 0.5612894296646118
  • Max Loss: 1.6812756061553955
  • Loss Points: 80
  • First Eval Loss: 2.7349205017089844
  • Final Eval Loss: 2.720693588256836
  • Min Eval Loss: 2.720693588256836
  • Max Eval Loss: 2.7349205017089844
  • Eval Loss Points: 5
  • Best Eval Loss: 2.720693588256836
  • Best Global Step: 80
  • Train runtime (s): 1337.8865
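The reported numbers allow a few quick sanity checks. The sketch below uses only values copied from the list above; the variable names are illustrative, and the per-step time assumes the reported runtime covers all 80 steps (it may also include evaluation time).

```python
# Values copied from the Run Status list above.
first_eval_loss = 2.7349205017089844
final_eval_loss = 2.720693588256836
global_step = 80
train_runtime_s = 1337.8865

# Absolute and relative change in eval loss across the run.
eval_delta = first_eval_loss - final_eval_loss
eval_rel_improvement = eval_delta / first_eval_loss

# Average wall-clock seconds per training step (assumes the runtime
# spans all steps and may include evaluation passes).
seconds_per_step = train_runtime_s / global_step

print(f"eval loss delta: {eval_delta:.4f}")
print(f"relative improvement: {eval_rel_improvement:.2%}")
print(f"seconds per step: {seconds_per_step:.2f}")
```

The eval loss moves by roughly half a percent over the run, which is consistent with the stated goal of a light seed stage rather than a full fit.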

Generated files:

  • training_config.json
  • stage_report.json
  • loss_history.csv
  • loss_curve.svg
  • eval_loss_history.csv
  • eval_loss_curve.svg

Loss curve: see loss_curve.svg

Eval loss curve: see eval_loss_curve.svg

Context

  • Purpose: Light behavior seed. Teach enough of the trace format and tool-use habits without drilling a full behavior path.
  • Previous adapter: none; stage 1 initializes the LoRA
  • Next stage: stage2-capability-step-sft
  • Base model: Qwen/Qwen3.5-2B
  • Data file: data/assembled/sft_qwen_messages_behavior_seed_light.jsonl
  • Eval file: data/eval/eval_sft_with_retention.jsonl
  • LoRA r/alpha/dropout: 16 / 16 / 0.0
  • Learning rate: 8e-07
  • Epochs: 1.0
  • Merged 16-bit model: not configured for this stage
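To make the LoRA line concrete, here is a minimal sketch of what r = 16 and alpha = 16 imply. It assumes the standard LoRA convention in which the low-rank update is scaled by alpha / r (not a rank-stabilized variant). `LoraSettings` is a hypothetical container, not a class from the training code; training_config.json remains the authoritative record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """Hypothetical holder for the LoRA values listed in Context."""
    r: int
    alpha: int
    dropout: float

    @property
    def scaling(self) -> float:
        # Standard LoRA multiplies the low-rank update B @ A by alpha / r.
        return self.alpha / self.r


# Values from the "LoRA r/alpha/dropout" line above.
stage1 = LoraSettings(r=16, alpha=16, dropout=0.0)
print(stage1.scaling)
```

With alpha equal to r, the scaling factor is 1.0, so the adapter update is applied without extra amplification or damping.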

Upstream Data

  • armand0e/qwen3.7-max-pi-traces
  • armand0e/badlogicgames-pi-mono-opus-filtered
  • armand0e/gpt-5.5-agent
  • armand0e/gpt-5.5-chat
  • TeichAI/claude-4.5-opus-high-reasoning-250x
  • TeichAI/Claude-Opus-4.6-Reasoning-887x

Compact Local Sample

{
"messages": [
{
"content": "You are a helpful assistant in chat mode. When the user asks for code or an artifact, return the artifact or a concrete patch in the message. Do not say you will edit files unless a file-editing tool is actually available.",
"role": "system"
},
{
"content": "No file tools are available here. Fix this snippet and return the corrected code block only, plus one short note:\n'''python\ndef normalize_tags(tags):\n return [t.lower() for t in tags.split(',')]\n'''",
"role": "user"
},
{
"content": "'''python\ndef normalize_tags(tags):\n if not tags:\n return []\n return [part.strip().lower() for part in tags.split(',') if part.strip()]\n'''\n\nNote: The function now handles None, trims whitespace, and drops empty tags.",
"reasoning_content": "The user wants a corrected snippet, but no edit tool is available. Provide the code directly and avoid claiming a file was changed.",
"role": "assistant"
}
],
"metadata": {
"failure_type": "claims_edit_without_tool",
"family": "no_tool_code_edit",
"has_system": true,
"source": "chat_repair_pack",
"source_key": "chat
...[truncated]
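A record in this shape can be checked before training with a small validator. The sketch below is illustrative only: `validate_record` is a hypothetical helper, the allowed role set is an assumption drawn from the three roles visible in the sample (tool-using traces may need more), and the rule that the last turn must be an assistant message is an assumption about how SFT targets are laid out.

```python
# Assumption: only the roles visible in the sample above are allowed.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_record(record):
    """Return a list of problems for one chat-format record (empty if none)."""
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return ["'messages' must be a non-empty list"]

    problems = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"message {i}: not an object")
            continue
        role = msg.get("role")
        if role not in ALLOWED_ROLES:
            problems.append(f"message {i}: unexpected role {role!r}")
        if not isinstance(msg.get("content"), str):
            problems.append(f"message {i}: 'content' must be a string")
        # In the sample, reasoning_content appears only on the assistant turn.
        if "reasoning_content" in msg and role != "assistant":
            problems.append(f"message {i}: 'reasoning_content' on a non-assistant turn")

    # Assumption: the final turn is the assistant response used as the target.
    last = messages[-1]
    if not (isinstance(last, dict) and last.get("role") == "assistant"):
        problems.append("last message is not an assistant turn")
    return problems


# Shortened stand-in for the sample record; contents are placeholders.
record = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant in chat mode."},
        {"role": "user", "content": "Fix this snippet."},
        {"role": "assistant", "content": "Corrected code.", "reasoning_content": "No edit tool is available."},
    ]
}
print(validate_record(record))
```

To check a whole file, parse each line of the JSONL with `json.loads` and pass the result to the same function.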

Reproduction

The exact stage command and package versions are in training_config.json.

Model details

  • Model provider: armand0e
  • Base model: Qwen/Qwen3.5-2B
  • Adapter: this model
  • Input modalities: Video, Text, Image
  • Output modalities: Text
