README

Intended format

Prompt the model to reason briefly, then emit the final Verilog between [BEGIN] and [DONE] tags:

text

Thinking:
[BEGIN]
module TopModule(...);
...
endmodule
[DONE]

Use the code between [BEGIN] and [DONE] as the final artifact.
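A minimal sketch of extracting the final artifact from a completion. It assumes the tags appear literally as [BEGIN] and [DONE] and that, if several tagged blocks appear, the last one is the one to keep. The helper name `extract_verilog` is hypothetical and not part of this project.

```python
import re
from typing import Optional

# Hypothetical helper; not shipped with the adapter.
# Non-greedy match so each [BEGIN] pairs with the nearest following [DONE].
_BLOCK = re.compile(r"\[BEGIN\](.*?)\[DONE\]", re.DOTALL)


def extract_verilog(completion: str) -> Optional[str]:
    """Return the code inside the last [BEGIN] ... [DONE] pair, or None."""
    blocks = _BLOCK.findall(completion)
    if not blocks:
        # No closed tagged block, e.g. the output was truncated.
        return None
    return blocks[-1].strip()


sample = (
    "Thinking: a single wire is enough.\n"
    "[BEGIN]\n"
    "module TopModule(input a, output b);\n"
    "  assign b = a;\n"
    "endmodule\n"
    "[DONE]\n"
)
print(extract_verilog(sample))
```

Returning None instead of raising lets the caller decide whether an untagged completion counts as a failure.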

Evaluation in this project

Local VerilogEval v2 spec-to-RTL evaluation of the single adapter, run directly: no retries, no selector, and no compiler feedback before the final answer:

text

v33 max4096: compile 113/156, pass 80/156 = 51.3%
v33 max1024: compile 101/156, pass 76/156 = 48.7%

These are clean, direct adapter results, not agentic pipeline results. The project's best practical pipeline remains a separate verifier-selector system.
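The reported percentages can be rechecked from the raw counts. The sketch below also derives two figures the results do not state directly: the compile rate and the pass rate among designs that compiled. The second figure assumes every passing design is also counted as compiling, since a design must compile before it can be simulated.

```python
TOTAL = 156  # number of problems, taken from the results above

# Counts copied from the results above.
runs = {
    "v33 max4096": {"compile": 113, "pass": 80},
    "v33 max1024": {"compile": 101, "pass": 76},
}


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


summary = {}
for name, counts in runs.items():
    summary[name] = {
        "compile_rate": pct(counts["compile"], TOTAL),
        "pass_rate": pct(counts["pass"], TOTAL),
        "pass_given_compile": pct(counts["pass"], counts["compile"]),
    }
    print(name, summary[name])
```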

Loading

python

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "Qwen/Qwen3.5-9B"  # base model weights
adapter = "Pablo-Flores-Mollinedo/verilog-qwen3.5-9b-v33-thinking-reinforced-lora"  # LoRA adapter

# The tokenizer comes from the adapter repo; the weights come from the base model.
tok = AutoTokenizer.from_pretrained(adapter, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(base, trust_remote_code=True, device_map="auto")
# Attach the adapter on top of the loaded base model.
model = PeftModel.from_pretrained(model, adapter)
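Once `model` and `tok` are loaded, generation might look like the sketch below. The prompt wording, the function names `build_prompt` and `generate_rtl`, and the default `max_new_tokens` value are all assumptions for illustration; the README does not specify the exact prompt used in training or evaluation.

```python
def build_prompt(spec: str) -> str:
    """Hypothetical prompt wording; adjust it to match how the adapter was trained."""
    return (
        "Write a Verilog module named TopModule for the specification below. "
        "Reason briefly, then put the final code between [BEGIN] and [DONE].\n\n"
        f"Specification:\n{spec}\n"
    )


def generate_rtl(model, tok, spec: str, max_new_tokens: int = 1024) -> str:
    """Generate a completion for `spec` and return only the newly generated text."""
    inputs = tok(build_prompt(spec), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the model's continuation is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tok.decode(new_tokens, skip_special_tokens=True)


# Usage, with `model` and `tok` loaded beforehand:
# text = generate_rtl(model, tok, "Output b equals input a.")
```

The returned text still contains the reasoning and the tags, so the tagged code has to be extracted before it is compiled.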

Notes

  • Adapter only; the base model's weights are required and its license applies.
  • Generated RTL should be compiled and simulated before use.
  • Benchmark scores are finite-testbench simulation results, not a formal proof.

Model provider

Pablo-Flores-Mollinedo

Model tree

Base

Qwen/Qwen3.5-9B

Adapter

this model

Modalities

Input

Video, Text, Image

Output

Text
