SwarmandBee
DiabeticDaily-9B
License: apache-2.0
Beat-base: proven
Held-out perplexity vs base Qwen3.5-9B (text never trained on):
| | held-out loss | perplexity |
|---|---|---|
| Base Qwen3.5-9B | 1.3625 | 3.906 |
| DiabeticDaily-9B | 0.8079 | 2.243 |
| Δ | −0.555 (+40.7% better) | |
Verdict: BEAT BASE ✅. Held-out loss is ~41% lower than base, so the model fits the domain markedly better, and its perplexity (2.24) is close to the 27B anchor's (2.05): the knowledge survives the shrink. That's the distillation-ladder thesis, proven.
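The table's columns are linked by perplexity = exp(loss), and the 40.7% figure matches the relative drop in held-out loss. Here is a minimal sketch of that arithmetic, assuming the losses are mean per-token cross-entropy in nats. The function names are illustrative and not taken from the evaluation code.

```python
import math

# Held-out losses copied from the table above.
BASE_LOSS = 1.3625   # Base Qwen3.5-9B
TUNED_LOSS = 0.8079  # DiabeticDaily-9B


def perplexity(loss: float) -> float:
    """Perplexity of a mean per-token cross-entropy loss (assumed in nats)."""
    return math.exp(loss)


def relative_loss_drop(base: float, tuned: float) -> float:
    """Percentage by which `tuned` loss is lower than `base` loss."""
    return (base - tuned) / base * 100.0


base_ppl = perplexity(BASE_LOSS)
tuned_ppl = perplexity(TUNED_LOSS)
loss_delta = TUNED_LOSS - BASE_LOSS
improvement = relative_loss_drop(BASE_LOSS, TUNED_LOSS)

print(f"base perplexity:  {base_ppl:.3f}")
print(f"tuned perplexity: {tuned_ppl:.3f}")
print(f"loss delta:       {loss_delta:.3f}")
print(f"relative drop:    {improvement:.1f}%")
```

Measured on perplexity rather than loss, the relative drop comes out slightly different, so it helps to state which quantity a percentage refers to.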
How it was cooked
- Base: Qwen/Qwen3.5-9B (Apache-2.0). Data: the same deeded OpenDiabetic corpus as the 27B anchor.
- Recipe: LoRA r64/α32 on attn+mlp, LR 1e-5, cosine, early-stop overcook guard. Merged bf16.
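The recipe above can be sketched in plain Python to show what the hyperparameters mean. This is an illustration under stated assumptions, not the actual training script. It assumes standard LoRA scaling (alpha / r), a cosine decay to zero with no warmup, and an early-stop guard that halts after a fixed number of non-improving evaluations. The step count, patience value, and all names here are hypothetical.

```python
import math

# Values from the recipe line.
LORA_RANK = 64
LORA_ALPHA = 32
PEAK_LR = 1e-5


def lora_scaling(rank: int, alpha: int) -> float:
    """Standard LoRA update scale, alpha / r (assumed; not rank-stabilized)."""
    return alpha / rank


def cosine_lr(step: int, total_steps: int,
              peak_lr: float = PEAK_LR, min_lr: float = 0.0) -> float:
    """Cosine decay from peak_lr to min_lr over total_steps (no warmup assumed)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


class EarlyStop:
    """Stop once eval loss fails to improve `patience` times in a row."""

    def __init__(self, patience: int = 2):  # patience is a hypothetical value
        self.patience = patience
        self.best = float("inf")
        self.bad_evals = 0

    def should_stop(self, eval_loss: float) -> bool:
        if eval_loss < self.best:
            self.best = eval_loss
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


guard = EarlyStop(patience=2)
# Made-up eval losses, only to exercise the guard.
decisions = [guard.should_stop(loss) for loss in [1.0, 0.9, 0.95, 0.97]]
```

With r64/α32 the adapter update is scaled by 0.5 under this convention, and the guard keeps the checkpoint from drifting past its best held-out loss.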
The ladder: 🐝 27B anchor (+57%) → 🏠 9B home (+40.7%) → 🛏️ 4B edge (+40.4%)
⚠️ Not medical advice — diabetic lifestyle/education/organization only. Not a diagnosis. Emergencies → 911.
© 2026 Swarm and Bee LLC · opendiabetic.com · Apache-2.0 · We slow cook the truth. 🐝
Model provider: SwarmandBee
Model tree: base Qwen/Qwen3.5-9B → fine-tuned: this model
Modalities: input video, text, image; output text