shreyash-pandey-katni

SQLForge-Mistral-7B-QLoRA

README

License: apache-2.0

Results

Evaluated on a held-out internal text-to-SQL test set (500 examples, schema-aware prompt):

Table with columns: Model, Exact Set Match, Valid SQL
Model	Exact Set Match	Valid SQL
`mistralai/Mistral-7B-v0.3` (base)	9.2%	86.2%
SQLForge-Mistral-7B-QLoRA	87.0%	97.4%

A 9.5× improvement in exact-match accuracy and a +11.2 pp lift in syntactic validity, delivered by a 336 MB LoRA adapter.

Usage

python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

BASE  = "mistralai/Mistral-7B-v0.3"
ADAPT = "shreyash-pandey-katni/SQLForge-Mistral-7B-QLoRA"

tok = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(
    BASE,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, ADAPT)
model.eval()

schema   = "CREATE TABLE employees (id INT, name TEXT, dept TEXT, salary REAL);"
question = "List the names of employees in the Engineering department earning over 100000."

prompt = f"""[INST] You are an expert SQL assistant. Given the schema and question, write a single SQL query.

Schema:
{schema}

Question: {question} [/INST]"""

inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))

4-bit inference (single 12 GB GPU)

python
from transformers import BitsAndBytesConfig

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb, device_map="auto")
model = PeftModel.from_pretrained(model, ADAPT)

Training Details

Base model: mistralai/Mistral-7B-v0.3
Method: QLoRA (4-bit NF4 base, BF16 LoRA adapters) via trl.SFTTrainer
Datasets:
- spider — cross-domain text-to-SQL
- b-mc2/sql-create-context — schema-grounded SQL
- Combined: 90,719 train / 3,929 val rows
Prompt format: [INST] … [/INST] (Mistral instruction template). Loss is masked to apply only on tokens after [/INST] (the SQL response).

LoRA configuration

Table with columns: Setting, Value
Setting	Value
Rank `r`	64
Alpha	16 (scale = α/r = 0.25, conservative — preserves reasoning)
Dropout	0.05
RSLoRA	enabled (1/√r scaling for r=64 stability)
Target modules	`q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj`
Bias	none

Training hyperparameters

Table with columns: Setting, Value
Setting	Value
Epochs	3
Effective batch size	8 (1 × 8 grad-accum)
Learning rate	2e-4
LR scheduler	cosine
Warmup ratio	0.03
Precision	BF16
Gradient checkpointing	enabled
Optimizer	AdamW (transformers default)
Checkpoint	(final)

Hardware

GPU: 1× NVIDIA RTX 3080 Ti (12 GB VRAM)
Peak VRAM: ~8.25 GB (4-bit NF4 base + BF16 adapters + grad-ckpt)
Training time: ~4–8 hours

Limitations

Schema sensitivity: trained with explicit CREATE TABLE schemas in the prompt; performance degrades without schema context.
Single-query bias: optimized for single SELECT statements; multi-statement scripts, DDL, and procedural SQL are out of scope.
Dialect: training data is primarily SQLite/ANSI; vendor-specific syntax (T-SQL, PL/pgSQL, BigQuery functions) may not generalize.
English-only: prompts and questions are English; non-English NL queries untested.
No execution safety: generated SQL should be sandboxed/parameterized before running against production databases.

Evaluation

The metrics above use exact_set_match (gold and predicted normalized SQL match exactly as token sets) and valid_sql_pct (predicted SQL parses with sqlparse/sqlite3). See evaluate_sql.py in the training repository for the full eval harness.

Files

Table with columns: File, Purpose
File	Purpose
`adapter_model.safetensors`	LoRA weights (~336 MB)
`adapter_config.json`	PEFT/LoRA configuration
`tokenizer.json` / `tokenizer_config.json`	Mistral tokenizer
`optimizer.pt`, `scheduler.pt`, `rng_state.pth`, `training_args.bin`, `trainer_state.json`

License

Released under Apache 2.0, the same license as the base model mistralai/Mistral-7B-v0.3.

Citation

bibtex
@misc{sqlforge_mistral7b_qlora_2026,
  title  = {SQLForge-Mistral-7B-QLoRA: A QLoRA Text-to-SQL Adapter for Mistral-7B-v0.3},
  author = {Shreyash Pandey Katni},
  year   = {2026},
  url    = {https://huggingface.co/shreyash-pandey-katni/SQLForge-Mistral-7B-QLoRA}
}

Available on FriendliAI

Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more

Container

Run this model inference with full control and performance in your environment.

Learn more

Model Details

Model Provider

shreyash-pandey-katni

Model Tree

Base

mistralai/Mistral-7B-v0.3

Adapter

this model

Input Modalities

Text

Output Modalities