pranavthombare

qwen3.5-0.8b-drivelm-lora-lr5e4

License: apache-2.0

TL;DR

ROUGE-L is 0.540, essentially tied with the lr=2e-4 sibling (0.541) and worse than lr=1e-4 (0.581).
The behavior category collapsed harder than at lr=2e-4: ROUGE-L 0.022 vs 0.036.
Final epoch-average training loss is 0.601 (vs 0.442 for 2e-4 and 0.417 for 1e-4); the higher LR did not converge as cleanly within one epoch.

Eval results (3,770-sample DriveLM front-arc, vLLM)

Metric	Baseline	This adapter (lr=5e-4)
ROUGE-1	0.166	0.547
ROUGE-L	0.157	0.540
Token-F1	0.117	0.497
Exact match	0.4%	35.8%
Mean per-request latency	1,420 ms	1,840 ms
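
For readers unfamiliar with the metrics in this table, here is a minimal sketch of how ROUGE-L, Token-F1, and exact match are commonly defined for short QA answers. The evaluation script behind these numbers is not shown in this card, so the tokenization (lowercase, whitespace split) and the F1 form of ROUGE-L are assumptions, and the function names are illustrative only.

```python
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumption: lowercase + whitespace split; the real eval may normalize differently.
    return text.lower().split()


def lcs_length(a: list[str], b: list[str]) -> int:
    # Dynamic-programming longest common subsequence, O(len(a) * len(b)).
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def f1(overlap: int, n_pred: int, n_ref: int) -> float:
    # Harmonic mean of precision and recall; 0.0 when there is no overlap.
    if overlap == 0 or n_pred == 0 or n_ref == 0:
        return 0.0
    precision = overlap / n_pred
    recall = overlap / n_ref
    return 2 * precision * recall / (precision + recall)


def rouge_l(pred: str, ref: str) -> float:
    # Order-sensitive: overlap is the longest common subsequence of tokens.
    p, r = tokenize(pred), tokenize(ref)
    return f1(lcs_length(p, r), len(p), len(r))


def token_f1(pred: str, ref: str) -> float:
    # Order-insensitive: overlap is the clipped bag-of-tokens intersection.
    p, r = tokenize(pred), tokenize(ref)
    overlap = sum((Counter(p) & Counter(r)).values())
    return f1(overlap, len(p), len(r))


def exact_match(pred: str, ref: str) -> bool:
    return tokenize(pred) == tokenize(ref)
```

Under these definitions, a prediction with the right words in the wrong order keeps a Token-F1 of 1.0 but loses ROUGE-L, which is why the two columns can diverge.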

Per question category (ROUGE-L)

Category	N	Baseline	This adapter
perception	1,738	0.217	0.513
prediction	1,181	0.097	0.617
planning	813	0.107	0.509
behavior	38	0.305	0.022
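
As a sanity check, the per-category rows can be recombined into the overall ROUGE-L. The sketch below assumes the overall score is a per-sample mean, i.e. an N-weighted average of the category scores. The card does not state this, but the arithmetic is consistent with it. It also shows why the behavior collapse barely moves the headline number: that category is about 1% of the eval set.

```python
# (category, N, baseline ROUGE-L, adapter ROUGE-L), copied from the table above.
rows = [
    ("perception", 1738, 0.217, 0.513),
    ("prediction", 1181, 0.097, 0.617),
    ("planning", 813, 0.107, 0.509),
    ("behavior", 38, 0.305, 0.022),
]

total_n = sum(n for _, n, _, _ in rows)


def weighted_mean(column: int) -> float:
    # N-weighted average of one score column (2 = baseline, 3 = adapter).
    return sum(row[1] * row[column] for row in rows) / total_n


baseline_overall = weighted_mean(2)
adapter_overall = weighted_mean(3)
behavior_share = 38 / total_n

print(total_n)                     # 3770
print(f"{baseline_overall:.3f}")   # 0.157
print(f"{adapter_overall:.3f}")    # 0.540
print(f"{behavior_share:.1%}")     # 1.0%
```

The recombined values match the overall ROUGE-L row of the eval table, so a rare-class regression like this one is only visible in the per-category breakdown.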

Training Details

Identical to the lr=1e-4 sibling except:

Knob	Value
Learning rate	5e-4
Final epoch-avg loss	0.601
Training wall clock	~20 minutes

Same base model, same LoRA r/α, same natural-1024 data, same camera mode, same epochs, same label-masking.
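
The card does not show how label-masking was implemented. A common approach, sketched here as an assumption, is to copy the input token IDs into the labels and overwrite the prompt positions with -100, the default `ignore_index` of PyTorch's cross-entropy loss, so that only answer tokens contribute to the training loss. The function name and token IDs below are hypothetical.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def mask_prompt_labels(input_ids: list[int], prompt_len: int) -> list[int]:
    """Return labels with the first `prompt_len` positions excluded from the loss."""
    if not 0 <= prompt_len <= len(input_ids):
        raise ValueError("prompt_len must lie within the sequence")
    return [IGNORE_INDEX] * prompt_len + list(input_ids[prompt_len:])


# Hypothetical token IDs: 4 prompt tokens (question and image placeholders),
# followed by 3 answer tokens.
input_ids = [101, 7, 8, 9, 42, 43, 2]
labels = mask_prompt_labels(input_ids, prompt_len=4)
print(labels)  # [-100, -100, -100, -100, 42, 43, 2]
```

With masking of this kind, a reported training loss reflects answer tokens only, which matters when comparing loss values across runs that use different prompt lengths.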

Why publish this as an ablation

The rubric for the assignment this was built for asks, "Are your choices intentional, or default?" The honest answer for the learning rate: the commonly used LoRA default (2e-4) was not the best choice for this task, since lr=1e-4 reaches 0.581 overall ROUGE-L against 0.541. Publishing all three sweep points backs that answer with measured numbers.

If you're tuning a similar VLM-LoRA on a small driving-QA dataset, this artifact is evidence that raising the learning rate above 2e-4 doesn't help overall and may hurt rare-class generalization (here, behavior with N=38) even more than 2e-4 does.

Position in the ablation series

Config	Sampling	lr	Epochs	Overall RL	Behavior RL
nat-1024 (canonical sibling)	natural	2e-4	1	0.541	0.036
lr1e4 (recommended)	natural	1e-4	1	0.581	0.877
lr5e4 (this adapter)	natural	5e-4	1	0.540	0.022

Limitations

Same as the series (train/eval overlap, no referent-token grounding, no CAN-bus, nuScenes-mini scope) plus: this adapter is not recommended for use — it's published for the ablation record.

License

Apache-2.0.

Model Details

Model provider: pranavthombare
Base model: Qwen/Qwen3.5-0.8B
Adapter: this model
Input modalities: text, image, video

Output Modalities