Use case
Reflections streams audio + video from MentraOS smart glasses, transcribes with Soniox, attributes speakers with an active-speaker-detection model, then asks this classifier whether the latest finalized sentence is actionable. When P(actionable) >= GLASSES_GATE_THRESHOLD (default 0.25), the pipeline escalates to Claude Haiku with tools (web search, Google Maps, Google Calendar). Otherwise the turn is dropped silently.
The classifier is not a general chat model. It is trained to output a single label (0 or 1) given five structured context inputs:
- Transcript: recent speaker-attributed turns, with the target sentence marked.
- Memory: short summary of prior sessions (read from memory.md).
- Available tools: names of tools the agent could call this turn (e.g. send_message, create_calendar_event).
- Location: a coarse description + lat/lon (used by maps-style tools).
- Entity list: known people in the wearer's life, with facts (allergies, preferences, etc.).
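As a rough illustration, the five inputs could be held in a small container like the one below. Every class and field name here is hypothetical; the actual prompt layout is whatever Reflections' own rendering code produces, and this sketch does not attempt to reproduce it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Turn:
    speaker: str
    text: str
    is_target: bool = False  # marks the sentence being classified


@dataclass
class Location:
    description: str  # coarse, human-readable description
    lat: float
    lon: float


@dataclass
class Entity:
    name: str
    facts: List[str] = field(default_factory=list)


@dataclass
class ClassifierInputs:
    """Hypothetical container for the five structured inputs."""
    transcript: List[Turn]
    memory: str
    available_tools: List[str]
    location: Location
    entities: List[Entity]

    def target_sentence(self) -> str:
        """Return the text of the single turn marked as the target."""
        targets = [t.text for t in self.transcript if t.is_target]
        if len(targets) != 1:
            raise ValueError("exactly one turn must be marked as the target")
        return targets[0]


example = ClassifierInputs(
    transcript=[
        Turn("wearer", "How was the weekend?"),
        Turn("friend", "Can you send me the address later?", is_target=True),
    ],
    memory="Met this friend last week for coffee.",
    available_tools=["send_message", "create_calendar_event"],
    location=Location("coffee shop downtown", 37.77, -122.42),
    entities=[Entity("friend", ["allergic to peanuts"])],
)
```

Keeping the target flag on the turn itself, rather than as a separate index, is just one way to guarantee that the marked sentence always travels with its speaker attribution.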
Prompts are rendered into Qwen's ChatML format and the score is softmax(logits)[1] over the two-token vocabulary {0, 1} at the <label> position.
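The scoring rule reduces to a two-way softmax. A minimal sketch, assuming the logits for the tokens 0 and 1 at the label position have already been extracted as plain floats (the function name is illustrative and not part of Reflections):

```python
import math


def p_actionable(logit_0: float, logit_1: float) -> float:
    """Return softmax([logit_0, logit_1])[1], computed in a numerically stable way."""
    m = max(logit_0, logit_1)       # subtract the max so exp() cannot overflow
    e0 = math.exp(logit_0 - m)
    e1 = math.exp(logit_1 - m)
    return e1 / (e0 + e1)
```

Because only two logits are involved, the result equals the logistic sigmoid of logit_1 - logit_0, so equal logits give exactly 0.5 and the score depends only on the gap between the two.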
How to use
The adapter is intended to be loaded onto the Qwen3-1.7B base model with PEFT:
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
BASE = "Qwen/Qwen3-1.7B"
ADAPTER = "rushilsaraf/qwen3-actionable-v2-adapter"
tokenizer = AutoTokenizer.from_pretrained(BASE)
base = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, ADAPTER)
model.eval()
For end-to-end use, install Reflections and run python -m apps.viewer — the LoRA loads automatically from this Hub repo (override with REFLECTIONS_LORA_MODEL_ID).
Training
| Setting | Value |
|---|---|
| Training base | unsloth/qwen3-1.7b-unsloth-bnb-4bit (Unsloth 4-bit) |
| Inference base | Qwen/Qwen3-1.7B (float16) |
| Framework | Unsloth + TRL SFT |
| Hardware | Single T4 (free Colab tier) |
| Wall-clock | ~50 minutes |
| Examples | ~400 (synthetic, labeled) |
Adapter config
| Parameter | Value |
|---|---|
| PEFT type | LoRA |
| Rank (r) | 8 |
| LoRA alpha | 16 |
| LoRA dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, |
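For reference, the table maps onto a small config mapping like the one below. The key names mirror common PEFT LoRA fields and are an assumption about how this adapter's config is stored; the module list is copied as it appears in the table, which may be incomplete. The ratio alpha / r is the standard LoRA multiplier applied to the low-rank update.

```python
# Values taken from the adapter config table; key names are assumed.
adapter_config = {
    "peft_type": "LORA",
    "r": 8,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    # Copied from the table as listed; the source list may be truncated.
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj",
    ],
}


def lora_scaling(config: dict) -> float:
    """Standard LoRA scaling: the low-rank update is multiplied by alpha / r."""
    return config["lora_alpha"] / config["r"]


scaling = lora_scaling(adapter_config)
print(scaling)  # 2.0
```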
Gate thresholds (Reflections-side)
Two distinct knobs; do not conflate them:
- GLASSES_GATE_THRESHOLD (default 0.25): the live gate used by the agent worker. Sentences scoring below this are silently dropped.
- REASONING_TRIGGER (0.45): used only by the offline smoke-test path (scripts/smoke_full_transcript.py, scripts/smoke_server.py) to decide whether to also generate a reasoning trace.
The live path never reads REASONING_TRIGGER.
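A minimal sketch of how the two knobs stay separate. Reading GLASSES_GATE_THRESHOLD from an environment variable is an assumption, as are the function names and the use of >= for REASONING_TRIGGER; only the two values and the >= comparison for the live gate come from this card.

```python
import os
from typing import Optional

DEFAULT_GLASSES_GATE_THRESHOLD = 0.25
REASONING_TRIGGER = 0.45  # offline smoke tests only


def glasses_gate_threshold() -> float:
    # Assumption: the override arrives as an environment variable.
    return float(os.environ.get("GLASSES_GATE_THRESHOLD",
                                DEFAULT_GLASSES_GATE_THRESHOLD))


def live_should_escalate(p_actionable: float,
                         threshold: Optional[float] = None) -> bool:
    """Live path: escalate when the score meets the gate; otherwise drop the turn."""
    if threshold is None:
        threshold = glasses_gate_threshold()
    return p_actionable >= threshold


def smoke_wants_reasoning(p_actionable: float) -> bool:
    """Offline smoke-test path only: decide whether to also generate a reasoning trace."""
    return p_actionable >= REASONING_TRIGGER
```

Under these assumptions a score of 0.30 would escalate on the live path but would not trigger a reasoning trace in the smoke tests, which is why the two values should not be treated as interchangeable.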
| Metric | Value |
|---|---|
| Test accuracy | 88.6% |
| Train accuracy | 94.3% |
| Train–test gap | +5.7 points |
| Raw Qwen3-1.7B (no LoRA) | 51.1% |
| Lift from LoRA | +37.5 points |
| Mean inference latency (Apple Silicon MPS) | ~196 ms |
| p95 latency | ~257 ms |
| Throughput | ~5 classifications / sec |
The benchmark is a synthetic dataset matched to the training distribution. Real-world ASR transcripts are not yet part of the evaluation set — see Limitations below.
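The derived rows in the table follow from the raw ones. A quick check of the arithmetic, using only the reported figures (throughput is approximated here as the reciprocal of mean latency, which assumes sequential, unbatched calls):

```python
# Reported figures from the metrics table.
test_acc, train_acc, base_acc = 88.6, 94.3, 51.1   # percent
mean_latency_ms = 196.0

gap = round(train_acc - test_acc, 1)        # train-test gap, percentage points
lift = round(test_acc - base_acc, 1)        # gain over the raw base model, points
throughput = 1000.0 / mean_latency_ms       # classifications per second

print(gap, lift, round(throughput, 1))
```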
Limitations
- English only.
- Synthetic training data ceiling. The roughly 400-example training set was generated to cover entity / memory / tool / location signals. Real-world ASR disfluencies are not represented.
- Weak categories. Per-category breakdowns show tool_dependent at ~40% and location_dependent at ~60% accuracy. Adding ~25 paired-negative examples per category in the next training cycle is intended to address the imbalance.
- Not a general classifier. The model expects the exact 5-input prompt structure produced by packages/proactivity/render.py in Reflections. Out-of-distribution prompts will produce unreliable scores.
- Not safety-critical. Do not use for medical, legal, or moderation decisions. This is a latency-saving gate in front of a stronger downstream LLM, not a standalone judgment.
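Per-category breakdowns like the one mentioned under Weak categories can be computed in a few lines. This is a sketch on made-up toy records, not the project's evaluation script; the record layout (category, gold label, predicted label) is an assumption, and the toy numbers deliberately do not match the reported accuracies.

```python
from collections import defaultdict


def per_category_accuracy(records):
    """Map each category to the fraction of records where gold == predicted."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for category, gold, pred in records:
        total[category] += 1
        correct[category] += int(gold == pred)
    return {c: correct[c] / total[c] for c in total}


# Toy data for illustration only.
toy_records = [
    ("tool_dependent", 1, 1),
    ("tool_dependent", 0, 1),      # false positive
    ("location_dependent", 1, 1),
    ("location_dependent", 0, 0),
]
breakdown = per_category_accuracy(toy_records)
```

Splitting accuracy this way is what makes a weak category visible even when the aggregate test accuracy looks healthy.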
License
Combined use of base + adapter remains subject to the Apache 2.0 license of the Qwen3 weights.
Citation
If you use this adapter, please reference the Reflections project and the Qwen3 base model:
@misc{qwen3-2025,
title = {Qwen3 Technical Report},
author = {Qwen Team},
year = {2025},
url = {https://huggingface.co/Qwen/Qwen3-1.7B}
}