License: apache-2.0
Results: base vs v2 vs v3 (real-defect eval, n=32)
| Metric | Base | v2 | v3 |
|---|---|---|---|
| Verdict accuracy | ~72% | 78.1% | 78.1% |
| Positive recall | 87.5% (14/16) | 56.2% (9/16) | 62.5% (10/16) |
| Negative specificity | ~56% | 100% | 93.8% |
| Category match | 56.2% | 43.8% | 43.8% |
| Invalid JSON | 0/32 | 0/32 | 0/32 |
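As a cross-check on the table, the v2 and v3 rows can be recomputed from confusion counts. This is a minimal sketch, not the project's eval script: the `Confusion` class is a hypothetical helper, the positive counts (9/16, 10/16) come from the table, and the negative counts (16/16 and 15/16) are inferred from n=32 and the reported specificities. Base is omitted because its specificity is only given approximately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Binary verdict counts for one model on the eval set (hypothetical helper)."""
    tp: int  # real defects flagged
    fn: int  # real defects missed
    tn: int  # clean items passed
    fp: int  # clean items flagged

    @property
    def recall(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fn + self.tn + self.fp)


# Positive counts are from the table; negative counts are inferred (assumption).
v2 = Confusion(tp=9, fn=7, tn=16, fp=0)
v3 = Confusion(tp=10, fn=6, tn=15, fp=1)

for name, c in (("v2", v2), ("v3", v3)):
    print(f"{name}: acc={c.accuracy:.3f} recall={c.recall:.3f} spec={c.specificity:.3f}")
```

Both versions get 25 of 32 verdicts right, which is why verdict accuracy is identical even though recall and specificity move in opposite directions.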
Training data
v2's 498 rows + 14 access/logic-bypass contrastive pairs (28 rows; gen_bypass_pairs.py, Drupal-expert-verified) = 526 rows.
QLoRA with r=16 on the q/k/v/o projections, batch size 4 with gradient checkpointing, MAX_LEN=2048, 3 epochs, lr 2e-4. The full 3-way report with per-item detail
ships in the project repo under docs/eval/dcr-qlora-v3-report.md.
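The training setup above can be written down as a plain configuration sketch. The values are the ones stated in this card; the dictionary keys and the `q_proj`/`k_proj`/`v_proj`/`o_proj` module names are assumptions chosen for illustration (they follow common naming for attention projections) and are not taken from the project's training script. Settings the card does not state, such as LoRA alpha or dropout, are deliberately left out.

```python
# Dataset size: v2's rows plus the new contrastive pairs (two rows per pair).
V2_ROWS = 498
BYPASS_PAIRS = 14
train_rows = V2_ROWS + 2 * BYPASS_PAIRS

# Hyperparameters as stated in the card; key names are hypothetical.
hparams = {
    "base_model": "Qwen/Qwen3-Coder-30B-A3B-Instruct",
    "method": "qlora",
    "lora_r": 16,
    # Assumed module names for the q/k/v/o attention projections.
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "batch_size": 4,
    "gradient_checkpointing": True,
    "max_len": 2048,
    "epochs": 3,
    "learning_rate": 2e-4,
}

print(f"training rows: {train_rows}")
for key, value in hparams.items():
    print(f"{key}: {value}")
```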
Limitations
Same as v2, plus: v3 gained one true positive over v2 (10/16 vs 9/16) but gave up v2's 100% specificity with one false positive, so verdict accuracy is unchanged at 78.1% and v3 is not a recommended upgrade over v2. Real-defect recall remains about 60%, and the access-bypass class is still largely missed. Keep a human in the loop; the model is one component of a hybrid pipeline (static analyzers + RAG + this adapter).
Model provider: bartek-flp
Base model: Qwen/Qwen3-Coder-30B-A3B-Instruct (this model is an adapter)
Modalities: text input, text output