README

Run Status

  • Status: complete_skipped
  • Adapter present: True
  • Latest checkpoint: outputs/qwen-capability-light/stage2-capability-step-sft/checkpoint-260
  • Best checkpoint: outputs/qwen-capability-light/stage2-capability-step-sft/checkpoint-260
  • Best eval loss: 2.5278091430664062
  • Trainer state: outputs/qwen-capability-light/stage2-capability-step-sft/trainer_state.json
  • Global step: 260
  • First Loss: 1.5216938257217407
  • Final Loss: 1.655211091041565
  • Min Loss: 0.6422638297080994
  • Max Loss: 2.0936222076416016
  • Loss Points: 260
  • First Eval Loss: 2.6287848949432373
  • Final Eval Loss: 2.5278091430664062
  • Min Eval Loss: 2.5278091430664062
  • Max Eval Loss: 2.6287848949432373
  • Eval Loss Points: 14
  • Best Eval Loss: 2.5278091430664062
  • Best Global Step: 260
  • Train runtime (s): 3794.8465
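
The first/final/min/max figures above can be recomputed from the trainer state file. The sketch below assumes that trainer_state.json follows the Hugging Face Trainer layout (a top-level "log_history" list whose entries carry "loss" or "eval_loss"); the README does not confirm this, and summarize_losses is a hypothetical helper name, not part of the training code.

```python
import json
from pathlib import Path


def summarize_losses(trainer_state: dict) -> dict:
    """Collect first/final/min/max for train and eval loss.

    Assumes the Hugging Face Trainer layout: a "log_history" list whose
    entries contain "loss" (training) or "eval_loss" (evaluation).
    """
    history = trainer_state.get("log_history", [])
    train = [entry["loss"] for entry in history if "loss" in entry]
    evals = [entry["eval_loss"] for entry in history if "eval_loss" in entry]

    def stats(values):
        # Return None when no points of this kind were logged.
        if not values:
            return None
        return {
            "first": values[0],
            "final": values[-1],
            "min": min(values),
            "max": max(values),
            "points": len(values),
        }

    return {
        "global_step": trainer_state.get("global_step"),
        "train": stats(train),
        "eval": stats(evals),
    }


if __name__ == "__main__":
    # Path taken from the "Trainer state" entry in the run status.
    path = Path(
        "outputs/qwen-capability-light/stage2-capability-step-sft/trainer_state.json"
    )
    if path.exists():
        print(json.dumps(summarize_losses(json.loads(path.read_text())), indent=2))
```

If the file uses a different layout, only the two list comprehensions need to change.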

Generated files:

  • training_config.json
  • stage_report.json
  • loss_history.csv
  • loss_curve.svg
  • eval_loss_history.csv
  • eval_loss_curve.svg

Loss curve: see loss_curve.svg

Eval loss curve: see eval_loss_curve.svg

Context

  • Purpose: Capability next-action SFT on coding-agent decisions plus verifier-style gap repairs.
  • Previous adapter: armand0e/qwen3.5-capability-light-behavior-seed-lora
  • Next stage: stage3-capability-dpo
  • Base model: Qwen/Qwen3.5-2B
  • Data file: data/assembled/sft_qwen_next_actions_capability_light.jsonl
  • Eval file: data/eval/eval_next_actions_with_retention.jsonl
  • LoRA r/alpha/dropout: 16 / 16 / 0.0
  • Learning rate: 8e-07
  • Epochs: 1.0
  • Merged 16-bit model: armand0e/qwen3.5-capability-light-capability-step-merged-16bit
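
As a rough illustration of what the LoRA settings above imply, the sketch below computes the standard LoRA scaling factor alpha / r, which is 1.0 for r = 16 and alpha = 16. It assumes the run uses plain LoRA scaling rather than a variant such as rank-stabilized LoRA, which the README does not state. LoraHyperparams and lora_param_count are hypothetical names, and the 2048 x 2048 layer shape is a placeholder, not a measured dimension of Qwen/Qwen3.5-2B.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraHyperparams:
    r: int
    alpha: float
    dropout: float

    @property
    def scaling(self) -> float:
        # Standard LoRA multiplies the low-rank update B @ A by alpha / r.
        return self.alpha / self.r


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    # A has shape (r, d_in) and B has shape (d_out, r).
    return r * (d_in + d_out)


# Values from the Context list: r / alpha / dropout = 16 / 16 / 0.0.
stage2 = LoraHyperparams(r=16, alpha=16, dropout=0.0)
print(stage2.scaling)  # 1.0

# Placeholder layer shape, for illustration only.
print(lora_param_count(2048, 2048, stage2.r))  # 65536
```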

Upstream Data

  • armand0e/qwen3.7-max-pi-traces
  • armand0e/badlogicgames-pi-mono-opus-filtered
  • armand0e/gpt-5.5-agent
  • armand0e/gpt-5.5-chat
  • TeichAI/claude-4.5-opus-high-reasoning-250x
  • TeichAI/Claude-Opus-4.6-Reasoning-887x

Compact Local Sample

json

{
  "messages": [
    {
      "content": "User/task context:\nuser: Give only the answer: (98 + 459) - 34 = ?",
      "role": "user"
    },
    {
      "content": "523",
      "reasoning_content": "Expression evaluates to 523. The answer should return only the numeric answer, while making sure no extra words.",
      "role": "assistant"
    }
  ],
  "metadata": {
    "expected": 523,
    "family": "arithmetic",
    "key": "gap/arithmetic/00000",
    "source": "gap_capability_pack"
  },
  "source": "gap_capability_pack"
}
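
A record of this shape can be sanity-checked by comparing the assistant's final answer with metadata.expected. The sketch below is one possible check, assuming every "arithmetic" record stores an integer in "expected" and a bare number in the assistant "content"; check_arithmetic_record is a hypothetical helper, not part of the data pipeline.

```python
def check_arithmetic_record(record: dict) -> bool:
    """Return True when the last assistant answer equals metadata.expected."""
    meta = record.get("metadata", {})
    if meta.get("family") != "arithmetic":
        raise ValueError("not an arithmetic record")

    answers = [m["content"] for m in record["messages"] if m["role"] == "assistant"]
    if not answers:
        return False
    try:
        # The sample answers with the bare number, so parse it as an integer.
        return int(answers[-1].strip()) == int(meta["expected"])
    except ValueError:
        # Extra words or a non-numeric answer count as a mismatch.
        return False


# The sample record shown above, with reasoning_content omitted for brevity.
sample = {
    "messages": [
        {
            "content": "User/task context:\nuser: Give only the answer: (98 + 459) - 34 = ?",
            "role": "user",
        },
        {"content": "523", "role": "assistant"},
    ],
    "metadata": {
        "expected": 523,
        "family": "arithmetic",
        "key": "gap/arithmetic/00000",
        "source": "gap_capability_pack",
    },
    "source": "gap_capability_pack",
}

print(check_arithmetic_record(sample))  # True
```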

Reproduction

The exact stage command and package versions are in training_config.json.

Model provider: armand0e

Model tree

  • Base: Qwen/Qwen3.5-2B
  • Adapter: this model

Modalities

  • Input: Video, Text, Image
  • Output: Text
