License: apache-2.0
Model details
- Base (inference): Qwen/Qwen2.5-1.5B-Instruct. RelayOps loads the adapter over this full-precision base.
- Trained on: unsloth/qwen2.5-1.5b-instruct-unsloth-bnb-4bit (Unsloth 4-bit QLoRA); the adapter loads on either base.
- Method: Unsloth + LoRA/QLoRA (adapter only)
- Task: single-label intent classification, output {"intent": "<label>"}
- Labels: reset_device, device_status, device_faq, billing, greeting, unknown
- Dataset: 2,400 examples, 400 per intent, curated seeds + deterministic template paraphrases, with group ids so paraphrase families don't leak across splits.
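To illustrate what a group-aware split means in practice, here is a minimal sketch. It is not the project's export script: the `group` field name, the 0.3 test fraction, and the toy data are assumptions made for illustration, and the seed value only echoes the "seed-13" label used in the evaluation table.

```python
import random
from collections import defaultdict


def group_split(examples, test_fraction=0.3, seed=13):
    """Split examples so that every group id lands in exactly one split."""
    by_group = defaultdict(list)
    for ex in examples:
        by_group[ex["group"]].append(ex)  # "group" is an assumed field name

    group_ids = sorted(by_group)  # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(group_ids)

    # Decide the split at the group level, never at the example level.
    n_test = max(1, round(len(group_ids) * test_fraction))
    test = [ex for g in group_ids[:n_test] for ex in by_group[g]]
    train = [ex for g in group_ids[n_test:] for ex in by_group[g]]
    return train, test


# Hypothetical data: 10 paraphrase families of 4 messages each.
examples = [
    {"text": f"seed {g} paraphrase {i}", "intent": "billing", "group": f"g{g}"}
    for g in range(10)
    for i in range(4)
]
train, test = group_split(examples)

# No paraphrase family appears on both sides of the split.
train_groups = {ex["group"] for ex in train}
test_groups = {ex["group"] for ex in test}
assert not train_groups & test_groups
```

Splitting by group rather than by example is what keeps near-duplicate paraphrases of one seed message from appearing in both training and held-out data.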
Intended use & scope
- Input: one customer chat message. Output: one intent label as JSON.
- The model classifies intent only. It does not decide risk, route, permissions, billing, offers, or account access — those are enforced by RelayOps' deterministic access gate and router (policy stays out of model weights).
- Confidence is read from the model's own token probabilities at inference, not baked into labels.
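One plausible way to turn token probabilities into a confidence score is to multiply the probabilities of the generated label tokens. The sketch below assumes that approach; the card does not say which formula RelayOps uses, and the log-probability values are hypothetical.

```python
import math


def label_confidence(token_logprobs):
    """Joint probability of the generated label tokens: exp(sum of log-probs)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(sum(token_logprobs))


# Hypothetical log-probabilities for the tokens that spell one label.
logprobs = [math.log(0.98), math.log(0.95), math.log(0.99)]
confidence = label_confidence(logprobs)
print(round(confidence, 3))  # roughly 0.92
```

With Hugging Face transformers, per-token log-probabilities of this kind can typically be obtained from `generate(..., output_scores=True, return_dict_in_generate=True)` followed by `compute_transition_scores(..., normalize_logits=True)`; whether RelayOps does it that way is not stated here.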
Out-of-scope use
Do not use this model to make billing, payment, plan-change, access-control, offer, or customer-eligibility decisions. It predicts intent only; those decisions belong to RelayOps' deterministic access gate, router, and human escalation. It is not a general-purpose intent model — it is trained on six telecom intents over synthetic data.
Evaluation
| Split | Accuracy | Macro-F1 |
|---|---|---|
| Held-out (seed-13, group-aware, 726 ex) | 0.999 | 0.999 |
| Hand-written adversarial / paraphrase (24 ex) | 0.958 | 0.804 |
Baselines on the same sets (held-out / adversarial accuracy): keyword 0.506 / 0.250; Complement NB 0.933 / 0.667.
Honest caveat. The held-out set is template-generated synthetic data, so high in-distribution scores are expected even with anti-leakage splits. Treat the held-out number as routing-slice validation, not a production benchmark; the adversarial set is the truer generalization signal, and the adversarial macro-F1 (0.804 < 0.958 accuracy) shows the model is still uneven on the hardest classes.
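The gap between accuracy and macro-F1 is easier to see on a toy case. The sketch below uses hypothetical labels and predictions, not the model's actual adversarial outputs, to show how a single miss on a rare class barely moves accuracy but drags macro-F1 down, because macro-F1 weights every class equally.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1; a class with no hits scores 0."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical 10-example set: the only "unknown" message is misread as "billing".
labels = ["billing", "greeting", "unknown"]
y_true = ["billing"] * 6 + ["greeting"] * 3 + ["unknown"]
y_pred = ["billing"] * 6 + ["greeting"] * 3 + ["billing"]

acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred, labels)
print(f"accuracy={acc:.3f} macro_f1={mf1:.3f}")
```

Here one error out of ten leaves accuracy at 0.9, while the F1 of 0 on the single-example class pulls the macro average far lower. On a 24-example set the same effect can be expected whenever an error falls on a class with few examples.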
Limitations
- Trained on synthetic telecom data for six intents; not a general intent model.
- Out-of-taxonomy / mixed-intent / abusive messages map to unknown, which RelayOps escalates; the model does not resolve them.
- Adversarial set is small (24); per-class adversarial recall and a larger set are follow-ups.
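Since the model emits a small JSON object and anything outside the taxonomy is meant to end up as unknown, a caller may want to validate the raw output defensively. The following is a minimal sketch under that assumption; `parse_intent` is a hypothetical helper, not part of the RelayOps codebase.

```python
import json

# The six labels listed in this card.
LABELS = {
    "reset_device", "device_status", "device_faq",
    "billing", "greeting", "unknown",
}


def parse_intent(raw_output):
    """Return a valid label, falling back to 'unknown' on anything malformed."""
    try:
        payload = json.loads(raw_output)
    except (json.JSONDecodeError, TypeError):
        return "unknown"
    if not isinstance(payload, dict):
        return "unknown"
    intent = payload.get("intent")
    # Guard against non-string values such as lists before the set lookup.
    if isinstance(intent, str) and intent in LABELS:
        return intent
    return "unknown"


print(parse_intent('{"intent": "billing"}'))  # billing
print(parse_intent('{"intent": "refund"}'))   # unknown
print(parse_intent("not json at all"))        # unknown
```

Collapsing every malformed or out-of-taxonomy output to unknown keeps the downstream decision simple: the deterministic layer only has to handle six known values.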
How to use (in RelayOps)
bash
RELAYOPS_INTENT_MODEL=<this-repo-or-local-adapter-dir> \
  python -m src.eval.run_intent_eval
or in code:
python
from src.router.registry import get_classifier

clf = get_classifier("finetuned")  # reads RELAYOPS_INTENT_MODEL
clf.classify("my internet is down")  # -> Classification(intent=reset_device, ...)
Reproduce
Training recipe: src/router/finetune_train.py (Unsloth LoRA). Data export: src/eval/export_finetune_data.py. Colab notebook: notebooks/finetune_intent_colab.ipynb.
Model provider: venkatamanideep
Model tree: base Qwen/Qwen2.5-1.5B-Instruct; adapter: this model
Modalities: text input, text output