Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more
Container

Run this model inference with full control and performance in your environment.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

README

License: apache-2.0

Why

The base model under-reports issues on real Drupal merge requests (high precision, low recall). This adapter is trained on a hybrid distillation set so the model both catches real Drupal anti-patterns (synthetic positives) and stays quiet on clean code (real merged-MR negatives).

Training data

400 teacher-labeled pairs (distillation_v1): 251 positive / 149 negative.

  • Positives — 243 synthetic across 26 Drupal anti-patterns (SQLi, XSS sinks, CSRF-on-GET, broken DI / missing create(), accessCheck() omissions, recursion in presave, deprecated APIs, etc.) + 7 real merge-request bugs.
  • Negatives — 149 clean, merged MRs from webform, paragraphs, drupal core, pathauto, commerce, search_api (teacher-verified clean). Teacher: Claude Opus 4.x. Each pair is (diff → JSON verdict + findings).

Usage (with the base model)

python

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
base = "Qwen/Qwen3-Coder-30B-A3B-Instruct"
tok = AutoTokenizer.from_pretrained(base)
m = AutoModelForCausalLM.from_pretrained(base, device_map="auto", torch_dtype="bfloat16")
m = PeftModel.from_pretrained(m, "bartek-flp/qwen3coder-30b-dcr-lora")

Prompt with the DCR system message (review a diff, output JSON findings only).

Results (A/B vs base, held-out val)

48 held-out pairs the adapter never saw (27 with a defect, 21 clean), temperature 0, served as the same base weights with the LoRA hot-swapped, so only the training differs.

Metric (n=48)BaseTuned
Verdict accuracy83.3% (40/48)95.8% (46/48)
Positive recall81.5% (22/27)92.6% (25/27)
Negative specificity85.7% (18/21)100% (21/21)
Category match40.7% (11/27)63.0% (17/27)
Invalid JSON4/480/48

Honest read: fine-tuning mostly bought reliability and calibration, not raw bug-finding. The base model already detects most issues, but on 4 positives it emitted unparseable JSON (often a stray \Drupal\ backslash) and on 3 clean diffs it raised false alarms. The adapter always returns valid JSON, holds 100% specificity, and names categories better. The cost: it missed two low-severity O(n²) array_merge-in-loop bugs the base model caught. A full report with verbatim side-by-side outputs, covering both the wins and the losses, ships in the project repo under docs/eval/.

Limitations

QLoRA on attention projections only; tuned for diff review, not general chat. The synthetic positives teach patterns, not every real-world manifestation. Always keep a human in the loop for security findings.

Model provider

bartek-flp

Model tree

Base

Qwen/Qwen3-Coder-30B-A3B-Instruct

Adapter

this model

Modalities

Input

Text

Output

Text

Pricing

Dedicated Endpoints

View details

Supported Functionality

Model APIs

Dedicated Endpoints

Container

More information

Explore FriendliAI today