Abdullahkousa2
sqlforge-qwen2.5-coder-1.5b
License: MIT
Results (full Spider dev set, 1034 examples)
| Model | Execution accuracy | Crashing queries |
|---|---|---|
| Base Qwen2.5-Coder-1.5B (zero-shot) | 57.45% | 228 |
| + this adapter | 65.57% | 148 |
| Change | +8.1 pts | −35% |
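The last row of the table can be reproduced from the two rows above it. The short sketch below only redoes that arithmetic with the figures from the table; the variable names are illustrative, and the remaining crash rate is a derived number that the table does not report directly.

```python
# Figures copied from the results table above.
TOTAL_EXAMPLES = 1034
base_acc, tuned_acc = 57.45, 65.57        # execution accuracy, in percent
base_crashes, tuned_crashes = 228, 148    # queries that failed to execute

# Absolute gain in percentage points.
gain_pts = tuned_acc - base_acc

# Relative change in the number of crashing queries.
crash_change = (tuned_crashes - base_crashes) / base_crashes * 100

# Derived figure: share of the dev set that still crashes with the adapter.
crash_rate = tuned_crashes / TOTAL_EXAMPLES * 100

print(f"gain: {gain_pts:+.1f} pts")
print(f"crashing queries: {crash_change:+.0f}%")
print(f"remaining crash rate: {crash_rate:.1f}%")
```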
Usage
python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "Qwen/Qwen2.5-Coder-1.5B-Instruct"
model = AutoModelForCausalLM.from_pretrained(base, device_map="auto")
model = PeftModel.from_pretrained(model, "Abdullahkousa2/sqlforge-qwen2.5-coder-1.5b")
tok = AutoTokenizer.from_pretrained(base)

messages = [
    {"role": "system", "content": "You are an expert data analyst. Given a SQLite "
     "database schema and a question, write a single valid SQLite SQL query that "
     "answers it. Respond with only the SQL query and nothing else."},
    {"role": "user", "content": 'Database schema:\nCREATE TABLE singer ("Name" text, "Age" int);\n\nQuestion: How many singers are there?'},
]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
out = model.generate(**tok(prompt, return_tensors="pt").to(model.device), max_new_tokens=128)
print(tok.decode(out[0], skip_special_tokens=True).split("assistant")[-1].strip())
# -> SELECT count(*) FROM singer
Or with the sqlforge package:
bash
pip install sqlforge
sqlforge -q "How many singers are there?" --db mydata.sqlite --run
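When calling the model through transformers directly, the user message has to carry the schema in the `Database schema: ... Question: ...` layout shown in the Python example. A minimal sketch of building that message from a live SQLite connection follows. `build_user_prompt` is a hypothetical helper written for illustration, not part of the sqlforge package, and it assumes that the `CREATE TABLE` text stored in `sqlite_master` is an acceptable schema description.

```python
import sqlite3


def build_user_prompt(conn: sqlite3.Connection, question: str) -> str:
    """Format the schema and question like the user message in the example.

    Hypothetical helper for illustration; not part of sqlforge.
    """
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'table' AND sql IS NOT NULL "
        "AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    # One CREATE TABLE statement per line, each terminated with a semicolon.
    schema = "\n".join(f"{sql};" for (sql,) in rows)
    return f"Database schema:\n{schema}\n\nQuestion: {question}"


# Demo on an in-memory database with the table from the example.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE singer ("Name" text, "Age" int)')
prompt = build_user_prompt(conn, "How many singers are there?")
print(prompt)
```

The returned string can be used as the `content` of the `user` message before calling `apply_chat_template`.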
Training
- Method: QLoRA, i.e. a 4-bit NF4 quantized base plus LoRA adapters (r=16, α=32, dropout=0.05) on all attention and MLP projections
- Schedule: 3 epochs, learning rate 2e-4 with a cosine schedule, effective batch size 16, bf16, paged AdamW 8-bit
- Hardware: a single RTX 3070 (8 GB)
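To give a feel for how small the adapter is, the sketch below estimates the number of trainable LoRA parameters implied by r=16 on all attention and MLP projections. The rank and α come from the list above. The layer dimensions and the projection names (`q_proj`, `k_proj`, and so on) are assumptions about the base model that this card does not state, so check them against the base model's `config.json` before relying on the result.

```python
# LoRA hyperparameters from the Training list above.
RANK, ALPHA = 16, 32

# ASSUMED architecture of the 1.5B base model (not stated in this card;
# verify against the base model's config.json).
HIDDEN, INTERMEDIATE, LAYERS = 1536, 8960, 28
KV_DIM = 2 * 128  # assumed: 2 key/value heads with head size 128

# ASSUMED module names, with (in_features, out_features) for one layer.
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(d_in: int, d_out: int, r: int = RANK) -> int:
    # LoRA adds A (r x d_in) and B (d_out x r) beside the frozen weight.
    return r * (d_in + d_out)


per_layer = sum(lora_params(i, o) for i, o in PROJECTIONS.values())
total = per_layer * LAYERS
scaling = ALPHA / RANK  # the low-rank update is scaled by alpha / r

print(f"trainable LoRA parameters: {total:,}")
print(f"LoRA scaling factor: {scaling}")
```

Under these assumptions the adapter holds on the order of tens of millions of trainable parameters, a small fraction of the 1.5B frozen base weights, which is what makes training on a single 8 GB GPU plausible.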
Links
- 💻 Code & training pipeline: github.com/abdullahkousa2/sqlforge
- 🤗 Live demo: huggingface.co/spaces/Abdullahkousa2/sqlforge
- 📈 Training run: Weights & Biases
Limitations
This is a 1.5B-parameter model. Its main failure mode is over-joining: building an unnecessary JOIN and then referencing a column on the wrong table. Fine-tuning cut this by about a third but did not eliminate it. State-of-the-art execution accuracy (~90%) requires a frontier model inside an agentic pipeline; a locally trained 1.5B model realistically tops out in the 60s to 70s (percent).
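The over-joining failure can be illustrated with a small SQLite session. The sketch below uses a hypothetical two-table schema and hand-written queries (none of them are actual model outputs) to show both outcomes: an over-joined query that crashes because it references a column on the wrong table, and one that runs but returns different rows. The `execution_match` check is a simplified stand-in for execution accuracy, not the official Spider evaluator.

```python
import sqlite3


def run(conn, sql):
    """Return (rows, None) on success or (None, error message) on a crash."""
    try:
        return conn.execute(sql).fetchall(), None
    except sqlite3.Error as exc:
        return None, str(exc)


def execution_match(conn, predicted, gold):
    """True when both queries run and return the same multiset of rows."""
    pred_rows, pred_err = run(conn, predicted)
    gold_rows, gold_err = run(conn, gold)
    if pred_err or gold_err:
        return False
    return sorted(pred_rows, key=repr) == sorted(gold_rows, key=repr)


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE singer (Singer_ID int, Name text, Age int);
    CREATE TABLE concert (Concert_ID int, Singer_ID int, Year int);
    INSERT INTO singer VALUES (1, 'Ann', 25), (2, 'Bo', 41);
    INSERT INTO concert VALUES (10, 1, 2014), (11, 1, 2015);
""")

gold = "SELECT Name FROM singer WHERE Age > 30"

# Unnecessary JOIN plus a column looked up on the wrong table: crashes.
crashing = ("SELECT s.Name FROM singer AS s JOIN concert AS c "
            "ON s.Singer_ID = c.Singer_ID WHERE c.Age > 30")

# Unnecessary JOIN with the right column: runs, but drops singers
# who have no concert, so the result differs from the gold query.
wrong_rows = ("SELECT s.Name FROM singer AS s JOIN concert AS c "
              "ON s.Singer_ID = c.Singer_ID WHERE s.Age > 30")

_, crash_error = run(conn, crashing)
print("crash:", crash_error)
print("over-joined matches gold:", execution_match(conn, wrong_rows, gold))
```

Counting queries where `run` returns an error gives a crashing-queries figure, and counting `execution_match` successes gives an execution-accuracy figure, under the simplifications noted above.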
Framework versions
PEFT 0.19.1 · TRL 1.5.1 · Transformers 4.57.6 · PyTorch 2.7.0+cu128
Model tree
- Base: Qwen/Qwen2.5-Coder-1.5B-Instruct
- Adapter: this model