Lance1573

acrouter-qwen35-08b-router-lora


README

License: apache-2.0

Files

  • adapter_model.safetensors: LoRA adapter weights.
  • adapter_config.json: PEFT adapter configuration with base model set to Qwen/Qwen3.5-0.8B.
  • tokenizer.json, tokenizer_config.json, chat_template.jinja: tokenizer assets copied from the training export.
  • training_config.json: compact training hyperparameters.
  • eval_metrics.json: ID test metrics for this router.
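As a quick check on the files listed above, the adapter configuration can be inspected without loading any weights. The following is a minimal sketch, assuming the file has already been downloaded locally and uses the standard PEFT key names (`base_model_name_or_path`, `r`, `lora_alpha`, and so on); the helper name `summarize_adapter_config` and the local path are hypothetical.

```python
import json
from pathlib import Path


def summarize_adapter_config(path):
    """Return the main LoRA settings from a PEFT adapter_config.json file."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    # Standard PEFT key names; .get() tolerates keys that an export omits.
    return {
        "base_model": config.get("base_model_name_or_path"),
        "peft_type": config.get("peft_type"),
        "rank": config.get("r"),
        "alpha": config.get("lora_alpha"),
        "dropout": config.get("lora_dropout"),
        "target_modules": sorted(config.get("target_modules") or []),
    }


if __name__ == "__main__":
    # Hypothetical local path: adjust to wherever the repository was downloaded.
    local = Path("adapter_config.json")
    if local.exists():
        print(summarize_adapter_config(local))
```

If the reported base model is not Qwen/Qwen3.5-0.8B, the adapter is probably being paired with the wrong checkpoint.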

Training Summary

  • Base model: Qwen/Qwen3.5-0.8B
  • PEFT type: LORA
  • LoRA rank: 16
  • LoRA alpha: 32
  • LoRA dropout: 0.05
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Epochs: 3
  • Learning rate: 0.0002
  • Max sequence length: 1024
  • Training samples: 4483
  • Backend model choices: claude-sonnet-4-6, claude-opus-4-6, kimi-k2.5, gpt-5.4, MiniMax-M2.7, qwen3.5-plus, glm-5, Qwen3-Max
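To make the LoRA hyperparameters above more concrete, here is a small sketch of what rank 16 and alpha 32 imply. It assumes the standard LoRA scaling of alpha / rank (the summary does not say whether a variant such as rank-stabilized LoRA was used), and the 1024 x 1024 layer shape is a hypothetical example, not the actual shape of any Qwen projection. The class name `LoraSettings` is illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    rank: int = 16
    alpha: int = 32
    dropout: float = 0.05

    @property
    def scaling(self) -> float:
        # Standard LoRA multiplies the low-rank update by alpha / rank.
        return self.alpha / self.rank

    def added_params(self, d_in: int, d_out: int) -> int:
        # Matrix A is (rank x d_in) and matrix B is (d_out x rank).
        return self.rank * (d_in + d_out)


settings = LoraSettings()
print(settings.scaling)  # 2.0
# Hypothetical 1024 x 1024 projection, not the real Qwen layer shape.
print(settings.added_params(1024, 1024))  # 32768
```

Under these assumptions, each adapted linear layer adds a number of trainable parameters that grows linearly with the rank, which is why a rank-16 adapter over seven module types stays small relative to the base model.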

Evaluation

Evaluated on the CodeRouterBench in-distribution (ID) test split (n=2919):

| Metric | Value |
| --- | --- |
| Avg performance | 0.474415 |
| Oracle performance | 0.570049 |
| Oracle gap | 0.095634 |
| Routing accuracy | 0.361425 |
| rAcc | 0.424460 |
| Strong model call rate | 0.373073 |
| Perf/cost ratio | 364.336998 |
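The reported oracle gap is consistent with a simple difference between oracle and average performance. The sketch below reproduces that arithmetic; the definition is inferred from the numbers rather than stated in the card, and "fraction of oracle" is a derived figure added here for illustration, not a reported metric.

```python
avg_performance = 0.474415
oracle_performance = 0.570049

# Gap between a perfect per-task choice and what the router achieves.
oracle_gap = oracle_performance - avg_performance
# Share of the oracle's performance that the router recovers.
fraction_of_oracle = avg_performance / oracle_performance

print(f"oracle gap: {oracle_gap:.6f}")  # oracle gap: 0.095634
print(f"fraction of oracle: {fraction_of_oracle:.3f}")  # fraction of oracle: 0.832
```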

Usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = "Qwen/Qwen3.5-0.8B"
adapter_id = "Lance1573/acrouter-qwen35-08b-router-lora"

# The tokenizer assets ship with the adapter repository.
tokenizer = AutoTokenizer.from_pretrained(adapter_id, trust_remote_code=True)

# Load the base model, then attach the LoRA adapter weights.
base = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()  # inference mode: disables dropout
```

The adapter is intended for model-routing prompts from Agent-as-a-Router rather than general-purpose instruction following.
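The card does not document the router's prompt or output format, so any post-processing has to be adapted to the actual Agent-as-a-Router templates. As a rough sketch under the assumption that the generated text mentions one of the backend model names verbatim, a selection could be extracted as follows; `parse_choice` is a hypothetical helper, not part of the released code.

```python
BACKEND_MODELS = [
    "claude-sonnet-4-6", "claude-opus-4-6", "kimi-k2.5", "gpt-5.4",
    "MiniMax-M2.7", "qwen3.5-plus", "glm-5", "Qwen3-Max",
]


def parse_choice(generated_text, choices=BACKEND_MODELS):
    """Return the backend model named earliest in the text, or None."""
    lowered = generated_text.lower()
    best = None  # (position, choice) of the earliest match so far
    for choice in choices:
        pos = lowered.find(choice.lower())
        if pos != -1 and (best is None or pos < best[0]):
            best = (pos, choice)
    return best[1] if best else None


print(parse_choice("Route this task to Claude-Opus-4-6."))  # claude-opus-4-6
print(parse_choice("No recognizable model name here."))  # None
```

Matching is case-insensitive and returns the canonical spelling from the list, so downstream code can use the result directly as a lookup key. If the router emits structured output instead, a stricter parser would be the better choice.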

Limitations

This is a task-specific router trained to select among the backend models listed above. It should not be used as a general coding assistant. The adapter repository does not include private API keys, raw trajectories, optimizer states, or training checkpoints.

Citation

```bibtex
@article{agent2026zhou,
  title         = {Agent-as-a-Router: Agentic Model Routing for Coding Tasks},
  author        = {Pengfei Zhou and Zhiwei Tang and Yixing Ma and Jiasheng Tang and Yizeng Han and Zhenglin Wan and Fanqing Meng and Wei Wang and Bohan Zhuang and Wangbo Zhao and Yang You},
  journal       = {arXiv preprint arXiv:2606.22902},
  year          = {2026},
  archivePrefix = {arXiv},
  eprint        = {2606.22902},
  url           = {https://arxiv.org/abs/2606.22902},
}
```

Model details

  • Model provider: Lance1573
  • Base model: Qwen/Qwen3.5-0.8B (this model is an adapter on top of it)
  • Input modalities: Video, Text, Image
  • Output modalities: Text
