Pablo-Flores-Mollinedo

verilog-qwen3.5-9b-v32-migration-lora

README

License: apache-2.0

Important caveat

This is not a clean zero-shot leaderboard model. The training mix includes benchmark-targeted verified outputs and distillation anchors from earlier adapters/pipelines. Treat the scores below as experiment results, not contamination-free leaderboard claims.

LoRA weights were not transferred from Qwen2.5-Coder. This adapter was trained directly on Qwen3.5-9B using verified examples/behavior from v9/v30b/v29/v31.

Results

VerilogEval v2 direct, spec-to-RTL, n=1, temperature 0

Model / system	Compile	Functional pass
v9 prior single adapter	—	67/156
v30b best Qwen2.5-Coder single adapter	141/156	71/156
v29 multi-adapter verifier selector	150/156	84/156
v32 Qwen3.5-9B migration	71/156	60/156

v32 underperformed as a single adapter, mainly because Qwen3.5 often produced long reasoning or malformed final code; this shows up in the low compile count (71/156). However, it had 12 functional wins over v30b (problems v32 passed that v30b did not), which makes it useful as a diversity/teacher checkpoint.
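To make the table easier to compare, the counts can be turned into rates. The sketch below is only arithmetic over the numbers reported above; the `pass_given_compile` column additionally assumes that every functional pass also compiled, which the table implies but does not state.

```python
# Counts copied from the results table (out of 156 problems).
# A compile count of None means the table reports no value for that row.
TOTAL = 156
RESULTS = {
    "v9 prior single adapter": (None, 67),
    "v30b best Qwen2.5-Coder single adapter": (141, 71),
    "v29 multi-adapter verifier selector": (150, 84),
    "v32 Qwen3.5-9B migration": (71, 60),
}


def summarize(results, total=TOTAL):
    """Return compile rate, pass rate, and pass rate among compiling outputs."""
    summary = {}
    for name, (compiled, passed) in results.items():
        summary[name] = {
            "compile_rate": None if compiled is None else compiled / total,
            "pass_rate": passed / total,
            # Assumption: functional passes are a subset of compiling outputs.
            "pass_given_compile": None if compiled is None else passed / compiled,
        }
    return summary


def fmt(value):
    """Format a rate as a percentage, or '-' when it is not reported."""
    return "-" if value is None else f"{value:.1%}"


for name, row in summarize(RESULTS).items():
    print(f"{name:40s} compile={fmt(row['compile_rate']):>6s} "
          f"pass={fmt(row['pass_rate']):>6s} "
          f"pass|compile={fmt(row['pass_given_compile']):>6s}")
```

Read this way, compilation looks like the main bottleneck for v32: roughly 85% of its compiling outputs also pass (60/71), compared with about 50% for v30b (71/141).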

Training data mix

Dataset builder: scripts/build_v32_qwen35_migration_dataset.py

Unique source counts:

67 v9 pass anchors.
71 v30b pass anchors.
17 v9-fail/v29-pass delta wins.
67 selector retention rows.
35 external/general rows.
382 clean verified rows.
316 synthetic verified rows.

Default repeat weights:

```text
v9 pass anchor:       14x
v30b pass anchor:     14x
delta wins:           45x
selector retention:    4x
external general:     20x
clean retention:       3x
synthetic:             1x
```

Training used --drop-overlength; overlength rows were dropped, not truncated.
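The repeat weights multiply the unique source counts. As a rough sketch of the resulting mix, the code below assumes that each weight applies to its whole source, that "clean retention" is the weight for the 382 clean verified rows, and that no rows are lost to `--drop-overlength`. The actual builder script may differ on any of these points.

```python
# (unique rows, repeat weight) per source, copied from the lists above.
# Assumption: "clean retention" is the weight for the 382 clean verified rows.
SOURCES = {
    "v9 pass anchor": (67, 14),
    "v30b pass anchor": (71, 14),
    "delta wins": (17, 45),
    "selector retention": (67, 4),
    "external general": (35, 20),
    "clean retention": (382, 3),
    "synthetic": (316, 1),
}


def effective_rows(sources):
    """Return the number of rows per source after repetition (count * weight)."""
    return {name: count * weight for name, (count, weight) in sources.items()}


rows = effective_rows(SOURCES)
total = sum(rows.values())
for name, n in rows.items():
    print(f"{name:20s} {n:5d} rows  {n / total:6.1%}")
print(f"{'total':20s} {total:5d} rows")
```

Under these assumptions the mix comes to 5,127 rows before overlength filtering, and the 17 delta wins alone contribute 765 of them.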

Training hyperparameters

```text
base model: Qwen/Qwen3.5-9B
method: QLoRA/LoRA
LoRA r: 32
LoRA alpha: 64
learning rate: 1e-5
epochs: 0.80
max length: 1536
batch size: 1
grad accum: 4
warmup steps: 40
```
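A few derived quantities follow from these settings. The sketch below assumes single-device training and the standard LoRA scaling of alpha / r; the row count passed to `optimizer_steps` is a hypothetical placeholder, since the number of rows that survived `--drop-overlength` is not reported.

```python
import math

# Values copied from the hyperparameter list above.
BATCH_SIZE = 1
GRAD_ACCUM = 4
LORA_R = 32
LORA_ALPHA = 64
EPOCHS = 0.80
WARMUP_STEPS = 40

# Sequences consumed per optimizer step (assumes a single device).
effective_batch = BATCH_SIZE * GRAD_ACCUM

# Standard LoRA scaling factor applied to the low-rank update.
lora_scaling = LORA_ALPHA / LORA_R


def optimizer_steps(num_rows, epochs=EPOCHS, eff_batch=effective_batch):
    """Approximate optimizer steps for a dataset of num_rows training rows."""
    return math.ceil(num_rows * epochs / eff_batch)


# Hypothetical dataset size, for illustration only.
example_rows = 1000
steps = optimizer_steps(example_rows)
print(f"effective batch size: {effective_batch}")
print(f"LoRA scaling (alpha / r): {lora_scaling}")
print(f"steps for {example_rows} rows: {steps} "
      f"(warmup covers {WARMUP_STEPS / steps:.0%})")
```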

Usage

Qwen3.5 uses a conditional-generation loader in the current Transformers stack, so the base model is loaded with AutoModelForImageTextToText.

```python
import torch
from transformers import AutoTokenizer, AutoModelForImageTextToText, BitsAndBytesConfig
from peft import PeftModel

base = "Qwen/Qwen3.5-9B"
adapter = "Pablo-Flores-Mollinedo/verilog-qwen3.5-9b-v32-migration-lora"

# Quantize the base model to 4-bit NF4 with bf16 compute to reduce GPU memory use.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# The tokenizer is loaded from the adapter repo; the weights come from the base model.
tok = AutoTokenizer.from_pretrained(adapter, trust_remote_code=True)
model = AutoModelForImageTextToText.from_pretrained(
    base,
    quantization_config=bnb,
    device_map="auto",
    trust_remote_code=True,
)
# Attach the LoRA adapter on top of the quantized base model.
model = PeftModel.from_pretrained(model, adapter)
model.eval()

prompt = "Write module TopModule(input a, input b, output out); out should be a & b."
messages = [
    {"role": "system", "content": "You are a Verilog RTL designer. Return synthesizable Verilog."},
    {"role": "user", "content": prompt},
]
# Render the chat template as text and append the assistant generation prompt.
text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tok(text, return_tensors="pt").to(model.device)
with torch.no_grad():
    # Greedy decoding (do_sample=False).
    out = model.generate(**inputs, max_new_tokens=1024, do_sample=False, pad_token_id=tok.eos_token_id)
# Decode only the newly generated tokens, skipping the prompt.
print(tok.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```
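Because this adapter often emits long reasoning or malformed final code, it can help to post-process the decoded text before compiling it. The helper below is a minimal sketch, not part of the original pipeline: `extract_verilog` is a hypothetical name, and it assumes reasoning is wrapped in `<think>...</think>` tags (as in Qwen3-style chat templates) and that the answer contains a complete `module ... endmodule` block.

```python
import re

# Assumption: reasoning, when present, is wrapped in <think>...</think> tags.
THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)
# A module header is "module <name>" followed by "#", "(" or ";", which avoids
# matching the word "module" in ordinary prose.
MODULE_RE = re.compile(r"\bmodule\s+\w+\s*(?:#|\(|;).*?\bendmodule\b", re.DOTALL)


def extract_verilog(generated):
    """Return the last complete module in the generated text, or None."""
    text = THINK_RE.sub("", generated)
    # If only a closing tag remains (the opening tag was part of the prompt),
    # keep what follows the last closing tag.
    if "</think>" in text:
        text = text.rsplit("</think>", 1)[1]
    modules = MODULE_RE.findall(text)
    return modules[-1] if modules else None


sample = (
    "<think>I need a module that ANDs a and b.</think>\n"
    "Here is the code:\n"
    "module TopModule(input a, input b, output out);\n"
    "  assign out = a & b;\n"
    "endmodule\n"
    "Done."
)
print(extract_verilog(sample))
```

A `None` result means no complete module was found, which is a cheap signal to retry or fall back to another adapter before invoking a compiler.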

Related artifacts

v33 Qwen3.5 thinking-reinforced LoRA: stronger follow-up, 76/156 VerilogEval pass.
v30b Qwen2.5-Coder LoRA: prior best single adapter, 71/156 VerilogEval pass.
v29 verifier selector: best practical pipeline, 84/156 pass.

Intended use

Research and experimentation with Verilog RTL generation. Always compile, simulate, lint, and review generated RTL before use.
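As one possible way to automate the compile and lint steps, the sketch below shells out to Icarus Verilog and Verilator. Both tools are assumptions chosen for illustration (the model card does not name a toolchain), the function names are hypothetical, and simulation against a testbench is not covered here.

```python
import shutil
import subprocess


def check_commands(verilog_path, out_path="sim.vvp"):
    """Build compile and lint command lines for a generated Verilog file."""
    return [
        # Icarus Verilog: compile with SystemVerilog-2012 support enabled.
        ["iverilog", "-g2012", "-o", out_path, verilog_path],
        # Verilator: lint only, with all warnings enabled.
        ["verilator", "--lint-only", "-Wall", verilog_path],
    ]


def run_checks(verilog_path):
    """Run each check; map tool name to True/False, or None if not installed."""
    results = {}
    for cmd in check_commands(verilog_path):
        tool = cmd[0]
        if shutil.which(tool) is None:
            results[tool] = None  # Tool not found on PATH; check skipped.
            continue
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results[tool] = proc.returncode == 0
    return results
```

Calling `run_checks("top_module.v")` returns a per-tool status; a passing result here only shows that the file compiles and lints cleanly, so functional simulation and human review are still needed.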
