License: apache-2.0
Important caveat
This is not a clean zero-shot leaderboard model. The training mix includes benchmark-targeted verified outputs and distillation anchors from earlier adapters/pipelines. Treat the scores below as experiment results, not contamination-free leaderboard claims.
LoRA weights were not transferred from Qwen2.5-Coder. This adapter was trained directly on Qwen3.5-9B using verified examples/behavior from v9/v30b/v29/v31.
Results
VerilogEval v2 direct, spec-to-RTL, n=1, temperature 0
| Model / system | Compile | Functional pass |
|---|---|---|
| v9 prior single adapter | — | 67/156 |
| v30b best Qwen2.5-Coder single adapter | 141/156 | 71/156 |
| v29 multi-adapter verifier selector | 150/156 | 84/156 |
| v32 Qwen3.5-9B migration | 71/156 | 60/156 |
v32 underperformed as a single adapter, mainly because Qwen3.5 often produced long reasoning or malformed final code. However, it had 12 functional wins over v30b, making it useful as a diversity/teacher checkpoint.
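To put the table on a common scale, the sketch below converts the functional pass counts to percentages and works out what the 12 wins imply. It assumes a "functional win" means a problem that v32 passes and v30b fails; the variable names are illustrative and not taken from the project scripts.

```python
TOTAL = 156  # VerilogEval v2 problems, from the table above

# Functional pass counts copied from the results table.
functional_pass = {
    "v9": 67,
    "v30b": 71,
    "v29 selector": 84,
    "v32": 60,
}


def pct(count: int, total: int = TOTAL) -> float:
    """Return count/total as a percentage rounded to one decimal."""
    return round(100.0 * count / total, 1)


rates = {name: pct(n) for name, n in functional_pass.items()}

# Assumption: a "functional win" is a problem v32 passes and v30b fails.
v32_wins_over_v30b = 12
shared_passes = functional_pass["v32"] - v32_wins_over_v30b  # passed by both
oracle_union = functional_pass["v30b"] + v32_wins_over_v30b  # passed by either

for name, rate in rates.items():
    print(f"{name}: {rate}%")
print(f"v30b + v32 oracle union: {oracle_union}/{TOTAL} ({pct(oracle_union)}%)")
```

Under that reading, a perfect selector over v30b and v32 alone would reach 83/156. That ceiling is the sense in which v32 adds diversity despite its lower single-adapter score.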
Training data mix
Dataset builder: scripts/build_v32_qwen35_migration_dataset.py
Unique source counts:
- 67 v9 pass anchors.
- 71 v30b pass anchors.
- 17 v9-fail/v29-pass delta wins.
- 67 selector retention rows.
- 35 external/general rows.
- 382 clean verified rows.
- 316 synthetic verified rows.
Default repeat weights:
```text
v9 pass anchor: 14x
v30b pass anchor: 14x
delta wins: 45x
selector retention: 4x
external general: 20x
clean retention: 3x
synthetic: 1x
```
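As a rough worked example, the unique counts and repeat weights can be multiplied out to estimate the size of the weighted mix. This sketch assumes each weight label corresponds to the source count listed in the same position (for example, "clean retention" to the 382 clean verified rows). The result is an upper bound, because it ignores rows removed later for length.

```python
# (label, unique rows, repeat weight), copied from the two lists above.
# Assumption: labels and counts pair up in listing order.
sources = [
    ("v9 pass anchor", 67, 14),
    ("v30b pass anchor", 71, 14),
    ("delta wins", 17, 45),
    ("selector retention", 67, 4),
    ("external general", 35, 20),
    ("clean retention", 382, 3),
    ("synthetic", 316, 1),
]

unique_rows = sum(count for _, count, _ in sources)
weighted = {label: count * repeat for label, count, repeat in sources}
total_rows = sum(weighted.values())

for label, rows in weighted.items():
    share = 100.0 * rows / total_rows
    print(f"{label:20s} {rows:5d} rows ({share:4.1f}%)")
print(f"unique rows: {unique_rows}, weighted rows: {total_rows}")
```

With these pairings, the 17 delta wins account for 765 weighted rows, more than the 316 synthetic rows, which shows how strongly the mix emphasizes the hardest verified examples.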
Training used --drop-overlength; overlength rows were dropped, not truncated.
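The difference between dropping and truncating can be illustrated with a minimal sketch. The helper names, the whitespace "tokenizer", and the inclusive length comparison are all assumptions made for illustration; the actual training script may count tokens and handle the boundary differently.

```python
from typing import Callable, List


def count_tokens_whitespace(text: str) -> int:
    """Stand-in tokenizer (assumption): counts whitespace-separated pieces."""
    return len(text.split())


def drop_overlength(
    rows: List[str], count_tokens: Callable[[str], int], max_length: int = 1536
) -> List[str]:
    """Keep only rows that fit within max_length; longer rows are discarded."""
    return [row for row in rows if count_tokens(row) <= max_length]


def truncate_overlength(rows: List[str], max_length: int) -> List[str]:
    """Alternative policy, shown for contrast: cut every row to max_length pieces."""
    return [" ".join(row.split()[:max_length]) for row in rows]


rows = [
    "module a; endmodule",
    "module b; wire x; wire y; assign x = y; endmodule",
]
kept = drop_overlength(rows, count_tokens_whitespace, max_length=5)
cut = truncate_overlength(rows, max_length=5)

print(kept)    # only the short, complete module survives
print(cut[1])  # the truncated row has lost its closing endmodule
```

One plausible motivation for dropping is visible in the second output: a truncated RTL example ends mid-module, so training on it would reinforce incomplete code.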
Training hyperparameters
```text
base model: Qwen/Qwen3.5-9B
method: QLoRA/LoRA
LoRA r: 32
LoRA alpha: 64
learning rate: 1e-5
epochs: 0.80
max length: 1536
batch size: 1
grad accum: 4
warmup steps: 40
```
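A few derived quantities follow from these settings. The sketch below assumes standard LoRA scaling (alpha divided by r) and a single training device; the class name and the row count passed to the step estimate are hypothetical, not values from the training run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters copied from the list above."""

    lora_r: int = 32
    lora_alpha: int = 64
    learning_rate: float = 1e-5
    epochs: float = 0.80
    max_length: int = 1536
    batch_size: int = 1
    grad_accum: int = 4
    warmup_steps: int = 40

    @property
    def lora_scale(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / r (assumption).
        return self.lora_alpha / self.lora_r

    @property
    def effective_batch(self) -> int:
        # Rows per optimizer step, assuming one device.
        return self.batch_size * self.grad_accum

    def approx_optimizer_steps(self, num_rows: int) -> int:
        # Approximate; real trainers differ in how they round partial epochs.
        return round(num_rows * self.epochs / self.effective_batch)


cfg = TrainConfig()
print(cfg.lora_scale, cfg.effective_batch)
# 5000 is a hypothetical post-filter row count, used only for illustration.
print(cfg.approx_optimizer_steps(5000))
```

With an effective batch of 4, the 40 warmup steps cover the first 160 rows seen.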
Usage
Qwen3.5 is loaded through a conditional-generation class in the current Transformers stack, so the example uses AutoModelForImageTextToText rather than a causal-LM loader.
```python
import torch
from transformers import AutoTokenizer, AutoModelForImageTextToText, BitsAndBytesConfig
from peft import PeftModel

base = "Qwen/Qwen3.5-9B"
adapter = "Pablo-Flores-Mollinedo/verilog-qwen3.5-9b-v32-migration-lora"

# 4-bit NF4 quantization for the base model
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# Load the tokenizer and quantized base model, then attach the LoRA adapter
tok = AutoTokenizer.from_pretrained(adapter, trust_remote_code=True)
model = AutoModelForImageTextToText.from_pretrained(
    base,
    quantization_config=bnb,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, adapter)
model.eval()

prompt = "Write module TopModule(input a, input b, output out); out should be a & b."
messages = [
    {"role": "system", "content": "You are a Verilog RTL designer. Return synthesizable Verilog."},
    {"role": "user", "content": prompt},
]
text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tok(text, return_tensors="pt").to(model.device)

# Greedy decoding, matching the temperature-0 evaluation setting
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=1024, do_sample=False, pad_token_id=tok.eos_token_id)

# Decode only the newly generated tokens
print(tok.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```
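Because v32 often emits long reasoning or malformed final code, it can help to post-process the decoded text before compiling it. The following is a minimal sketch, not the project's own extraction logic: it assumes reasoning ends with a `</think>` tag and that the answer is the last complete `module ... endmodule` span after it. The function name is hypothetical.

```python
import re
from typing import Optional

# Non-greedy match from a "module" keyword to the nearest "endmodule".
MODULE_RE = re.compile(r"\bmodule\b.*?\bendmodule\b", re.DOTALL)


def extract_verilog(generated: str) -> Optional[str]:
    """Return the last complete module in the answer, or None if there is none."""
    # Keep only what follows the last closing reasoning tag, if any (assumed format).
    answer = generated.rsplit("</think>", 1)[-1]
    modules = MODULE_RE.findall(answer)
    return modules[-1].strip() if modules else None


sample = (
    "<think>Maybe module Draft(); endmodule is not right.</think>\n"
    "Here is the design:\n"
    "module TopModule(input a, input b, output out);\n"
    "  assign out = a & b;\n"
    "endmodule\n"
)
code = extract_verilog(sample)
print(code)
print(extract_verilog("I could not finish the design."))
```

Returning None for output with no complete module lets a caller count the sample as a failure rather than passing malformed text to a compiler.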
Related artifacts
- v33 Qwen3.5 thinking-reinforced LoRA: stronger follow-up, 76/156 VerilogEval pass.
- v30b Qwen2.5-Coder LoRA: prior best single adapter, 71/156 VerilogEval pass.
- v29 verifier selector: best practical pipeline, 84/156 pass.
Intended use
Research and experimentation with Verilog RTL generation. Always compile, simulate, lint, and review generated RTL before use.
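As one way to automate the compile step, the sketch below runs a syntax and elaboration check with Icarus Verilog. The choice of tool is an assumption, since this card does not name a compiler. The helper names are hypothetical, and the check covers neither simulation nor lint.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def iverilog_command(source: Path) -> List[str]:
    """Build an Icarus Verilog check: SystemVerilog-2012 mode, null target (no output)."""
    return ["iverilog", "-g2012", "-t", "null", str(source)]


def compiles(source: Path, timeout_s: float = 30.0) -> Optional[bool]:
    """Return True/False for compile success, or None if iverilog is not installed."""
    if shutil.which("iverilog") is None:
        return None
    result = subprocess.run(
        iverilog_command(source),
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    return result.returncode == 0


print(iverilog_command(Path("top_module.sv")))
```

A passing compile check is only the first gate; functional correctness still requires simulation against a testbench and human review.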
Model provider
Pablo-Flores-Mollinedo
Model tree
Base
Qwen/Qwen3.5-9B
Adapter
this model
Modalities
Input
Video, Text, Image
Output
Text