Dedicated Endpoints
Run this model inference on single tenant GPU with unmatched speed and reliability at scale.
Container
Run this model inference with full control and performance in your environment.
Get help setting up a custom Dedicated Endpoints.
Talk with our engineer to get a quote for reserved GPU instances with discounts.
README
License: apache-2.0🎯 Model Details
- Developed by: Lakshitha Nuwan
- Model type: Causal Language Model (Fine-tuned LLM)
- Language(s) (NLP): English
- License: Apache 2.0
- Finetuned from model: unsloth/Llama-3.2-3B-Instruct
- Training Framework: Unsloth & PyTorch
🔗 Model Sources
- HuggingFace Repository: lakshitha722/querymind-nl2sql
- Interactive Live Demo: HuggingFace Space Demo
- Experiment Tracking: Weights & Biases (W&B) Dashboard
💻 How to Get Started with the Model
Use the code below to load the model and generate SQL queries using Unsloth (recommended for local GPUs) or standard HuggingFace Transformers.
Inference with Unsloth (Recommended)
python
from unsloth import FastLanguageModelimport torchMODEL_NAME = "lakshitha722/querymind-nl2sql"model, tokenizer = FastLanguageModel.from_pretrained(model_name = MODEL_NAME,max_seq_length = 1024,load_in_4bit = True,dtype = None,)FastLanguageModel.for_inference(model)# 1. Define Prompt TemplatePROMPT_TEMPLATE = """Below is an instruction that describes a task. Write a response that appropriately completes the request.### Instruction:Convert the following natural language question to a SQL query based on the given database schema. Return ONLY the SQL query, nothing else.### Schema:{schema}### Question:{question}### Response:"""# 2. Prepare Inputsschema = "Database: company\nTables: employees (id, name, department, salary, hire_date)"question = "What is the average salary by department?"prompt = PROMPT_TEMPLATE.format(schema=schema, question=question)inputs = tokenizer([prompt], return_tensors="pt").to("cuda")# 3. Generatewith torch.no_grad():outputs = model.generate(**inputs,max_new_tokens = 150,temperature = 0.1,do_sample = False,pad_token_id = tokenizer.eos_token_id,)# 4. Decode Outputinput_length = inputs['input_ids'].shape[1]sql = tokenizer.decode(outputs[0][input_length:], skip_special_tokens=True).strip()print("Generated SQL:", sql)
Model provider
lakshitha722
Model tree
Base
unsloth/Llama-3.2-3B-Instruct
Adapter
this model
Modalities
Input
Text
Output
Text
Pricing
Dedicated Endpoints
View detailsSupported Functionality
Model APIs
Dedicated Endpoints
Container
More information