Lance1573
acrouter-qwen35-08b-router-lora
README
License: apache-2.0
Files
- adapter_model.safetensors: LoRA adapter weights.
- adapter_config.json: PEFT adapter configuration with base model set to Qwen/Qwen3.5-0.8B.
- tokenizer.json, tokenizer_config.json, chat_template.jinja: tokenizer assets copied from the training export.
- training_config.json: compact training hyperparameters.
- eval_metrics.json: ID test metrics for this router.
Training Summary
- Base model: Qwen/Qwen3.5-0.8B
- PEFT type: LORA
- LoRA rank: 16
- LoRA alpha: 32
- LoRA dropout: 0.05
- Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Epochs: 3
- Learning rate: 0.0002
- Max sequence length: 1024
- Training samples: 4483
- Backend model choices: claude-sonnet-4-6, claude-opus-4-6, kimi-k2.5, gpt-5.4, MiniMax-M2.7, qwen3.5-plus, glm-5, Qwen3-Max
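To make the LoRA settings above concrete, here is a minimal sketch that restates them as a plain dictionary and derives two quantities from them. The key names mirror the argument names of `peft.LoraConfig`; the helper functions and the 1024-wide layer are hypothetical and only illustrate the arithmetic, not this model's actual layer shapes.

```python
# Hyperparameters copied from the Training Summary; key names follow peft.LoraConfig.
lora_hparams = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}


def lora_scaling(r: int, lora_alpha: int) -> float:
    """Standard LoRA scaling factor (alpha / r) applied to the low-rank update."""
    return lora_alpha / r


def lora_params_per_matrix(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added to one adapted weight matrix.

    LoRA adds A (r x d_in) and B (d_out x r), so the count is r * (d_in + d_out).
    """
    return r * (d_in + d_out)


scaling = lora_scaling(lora_hparams["r"], lora_hparams["lora_alpha"])

# Hypothetical square projection of width 1024, chosen only for illustration.
example_params = lora_params_per_matrix(1024, 1024, lora_hparams["r"])

print(f"scaling factor: {scaling}")
print(f"adapter params for a 1024x1024 matrix: {example_params}")
```

With rank 16 and alpha 32, the low-rank update is scaled by 2.0 before being added to the frozen base weights.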
Evaluation
Evaluated on the CodeRouterBench ID test split (n=2919):
| metric | value |
|---|---|
| Avg performance | 0.474415 |
| Oracle performance | 0.570049 |
| Oracle gap | 0.095634 |
| Routing accuracy | 0.361425 |
| rAcc | 0.424460 |
| Strong model call rate | 0.373073 |
| Perf/cost ratio | 364.336998 |
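The relationship between some of these numbers can be checked with a few lines of arithmetic. The sketch below assumes the oracle gap is the oracle performance minus the average performance, which the reported values agree with, and that the rate metrics are fractions of the n=2919 test samples. The dictionary keys are hypothetical and may not match the field names in eval_metrics.json.

```python
# Values copied from the table above; key names are illustrative only.
metrics = {
    "avg_performance": 0.474415,
    "oracle_performance": 0.570049,
    "routing_accuracy": 0.361425,
    "strong_model_call_rate": 0.373073,
}
n_samples = 2919

# Gap between the best possible per-sample choice and the router's choices.
oracle_gap = metrics["oracle_performance"] - metrics["avg_performance"]

# Share of the oracle's performance that the router recovers.
fraction_of_oracle = metrics["avg_performance"] / metrics["oracle_performance"]

# Sample counts implied by the rate metrics, assuming they are fractions of n_samples.
implied_matching_routes = round(metrics["routing_accuracy"] * n_samples)
implied_strong_calls = round(metrics["strong_model_call_rate"] * n_samples)

print(f"oracle gap: {oracle_gap:.6f}")
print(f"fraction of oracle performance: {fraction_of_oracle:.3f}")
print(f"implied matching routes: {implied_matching_routes}")
print(f"implied strong-model calls: {implied_strong_calls}")
```

Under these assumptions the router recovers roughly 83% of the oracle's performance on this split.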
Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "Qwen/Qwen3.5-0.8B"
adapter_id = "Lance1573/acrouter-qwen35-08b-router-lora"

tokenizer = AutoTokenizer.from_pretrained(adapter_id, trust_remote_code=True)
base = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()
```
The adapter is intended for model-routing prompts from Agent-as-a-Router rather than general-purpose instruction following.
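Because the router chooses among a fixed set of backend models, downstream code typically needs to map the generated text back to one of those names. The sketch below is one hedged way to do that. It assumes the router's output contains the chosen backend name as plain text, which this card does not specify, and `pick_backend` is a hypothetical helper, not part of Agent-as-a-Router.

```python
from typing import Optional, Sequence

# Backend model choices listed in the Training Summary.
BACKEND_CHOICES = (
    "claude-sonnet-4-6", "claude-opus-4-6", "kimi-k2.5", "gpt-5.4",
    "MiniMax-M2.7", "qwen3.5-plus", "glm-5", "Qwen3-Max",
)


def pick_backend(
    generated_text: str,
    choices: Sequence[str] = BACKEND_CHOICES,
) -> Optional[str]:
    """Return the backend name that appears earliest in the text, or None.

    Matching is case-insensitive. This assumes the output names the chosen
    backend verbatim; the real output format may differ.
    """
    lowered = generated_text.lower()
    best_name = None
    best_pos = len(lowered) + 1
    for name in choices:
        pos = lowered.find(name.lower())
        if pos != -1 and pos < best_pos:
            best_name, best_pos = name, pos
    return best_name


print(pick_backend("Selected model: glm-5"))
print(pick_backend("no recognizable model name here"))
```

Returning None for unrecognized output lets the caller fall back to a default backend instead of failing on a malformed generation.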
Limitations
This is a task-specific router trained for selecting among the backend models listed above. It should not be interpreted as a general coding assistant. The adapter does not include private API keys, raw trajectories, optimizer states, or training checkpoints.
Citation
```bibtex
@article{agent2026zhou,
  title = {Agent-as-a-Router: Agentic Model Routing for Coding Tasks},
  author = {Pengfei Zhou and Zhiwei Tang and Yixing Ma and Jiasheng Tang and Yizeng Han and Zhenglin Wan and Fanqing Meng and Wei Wang and Bohan Zhuang and Wangbo Zhao and Yang You},
  journal = {arXiv preprint arXiv:2606.22902},
  year = {2026},
  archivePrefix = {arXiv},
  eprint = {2606.22902},
  url = {https://arxiv.org/abs/2606.22902},
}
```
Model provider
Lance1573
Model tree
Base
Qwen/Qwen3.5-0.8B
Adapter
this model
Modalities
Input
Video, Text, Image
Output
Text