pranavthombare

qwen3.5-0.8b-drivelm-lora-lr1e4

README

License: apache-2.0

Highlights

ROUGE-L 0.581 on 3,770 DriveLM samples — 3.7× the zero-shot baseline (0.157).
Behavior recovery without a data fix. At lr=2e-4 the behavior category collapsed to ROUGE-L 0.036 (terse mode collapse); at lr=1e-4 on the same natural-distribution data it scores 0.877 — almost matching the stratified-data run (0.911). The behavior collapse turned out to be an LR effect, not a data effect.
Adapter is 12.8 MB — same rank/alpha as the other variants.

Eval results (3,770-sample DriveLM front-arc, vLLM)

Table with columns: Metric, Baseline, This adapter (lr=1e-4), Δ
Metric	Baseline	This adapter (lr=1e-4)	Δ
ROUGE-1	0.166	0.591	+0.425
ROUGE-L	0.157	0.581	+0.424
Token-F1	0.117	0.544	+0.427
Exact match	0.4%	41.9%	+41.5 pp
Mean per-request latency	1,420 ms	2,098 ms	+678 ms

Per question category (ROUGE-L)

Table with columns: Category, N, Baseline, This adapter, Δ
Category	N	Baseline	This adapter	Δ
perception	1,738	0.217	0.533	+0.316
prediction	1,181	0.097	0.696	+0.599
planning	813	0.107	0.503	+0.396
behavior

The behavior win is the headline differentiator from the lr=2e-4 variant — see "Position in the ablation series" below.

Training Details

Table

Base model	`Qwen/Qwen3.5-0.8B`
Adapter type	QLoRA (NF4 4-bit base + LoRA r=8)
LoRA rank / alpha	8 / 16
Target modules	`q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`

Position in the ablation series

Table with columns: Config, Sampling, lr, Epochs, Overall RL, Behavior RL
Config	Sampling	lr	Epochs	Overall RL	Behavior RL
nat-1024 (canonical sibling)	natural	2e-4	1	0.541	0.036 ⚠️
lr1e4 (this adapter)	natural	1e-4	1	0.581 ⭐	0.877 ⭐
lr5e4

Limitations

Train/eval overlap. Training set is a subset of the eval set.
No referent-token grounding (<c1,CAM_FRONT,x,y> ignored).
No CAN-bus signal access for behavior ego-velocity attributes.
nuScenes-mini scope — 38 frames, 6 scenes, daylight bias.
Latency — produces longer outputs than the lr=2e-4 sibling (+1 second mean latency).

Usage

python
from peft import PeftModel
from transformers import AutoProcessor, AutoModelForImageTextToText

base = AutoModelForImageTextToText.from_pretrained("Qwen/Qwen3.5-0.8B", trust_remote_code=True)
processor = AutoProcessor.from_pretrained("Qwen/Qwen3.5-0.8B", trust_remote_code=True)
model = PeftModel.from_pretrained(base, "pranavthombare/qwen3.5-0.8b-drivelm-lora-lr1e4").eval()

License

Apache-2.0.

Available on FriendliAI

Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more

Model Details

Model Provider

pranavthombare

Model Tree

Base

Qwen/Qwen3.5-0.8B

Adapter

this model

Input Modalities

Text

Image

Video

Output Modalities