License: apache-2.0
Model Details
- Base model: Qwen/Qwen2.5-Coder-7B-Instruct
- Fine-tuning method: QLoRA (4-bit)
- LoRA rank: 64
- Precision: bf16
- Training samples: 7,731
- Epochs: 3
- Max length: 1024 tokens
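To put these numbers in perspective, the sketch below works out two quantities that follow from them: how many parameters a rank-64 LoRA adapter adds to a single linear layer, and how many optimizer steps three epochs over 7,731 samples take. The layer size of 3584 and the effective batch size of 16 are assumptions for illustration; the card states neither the targeted modules nor the batch size, so check the base model's `config.json` and your own training setup.

```python
import math

LORA_RANK = 64            # from the model card
TRAINING_SAMPLES = 7_731  # from the model card
EPOCHS = 3                # from the model card


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters LoRA adds to one linear layer: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def total_optimizer_steps(samples: int, epochs: int, effective_batch_size: int) -> int:
    """Optimizer steps when the last partial batch of each epoch is kept."""
    return math.ceil(samples / effective_batch_size) * epochs


# Assumption: a square 3584 x 3584 projection, used here only as an example shape.
per_layer = lora_params(3584, 3584, LORA_RANK)

# Assumption: effective batch size of 16 (not stated in the card).
steps = total_optimizer_steps(TRAINING_SAMPLES, EPOCHS, effective_batch_size=16)

print(per_layer, steps)
```

The adapter cost grows linearly with the rank, which is why a rank of 64 stays small next to the full weight matrix it modifies.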
Training Data
- Hoglet-33/webdev-coding-dataset
- sahil2801/CodeAlpaca-20k (web-filtered)
- HuggingFaceH4/CodeAlpaca_20K (web-filtered)
- HuggingFaceM4/WebSight (local cache)
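The card does not say how the "web-filtered" subsets were produced. One plausible approach is a keyword filter over each sample's text, sketched below. The keyword list, the `is_web_sample` helper, and the `instruction`/`output` field names are all hypothetical and would need adapting to the actual dataset schema.

```python
import re

# Hypothetical keyword list; the real filter used for this model is not documented.
WEB_KEYWORDS = ("html", "css", "javascript", "react", "dom", "flexbox", "navbar")
_WEB_PATTERN = re.compile(r"\b(" + "|".join(WEB_KEYWORDS) + r")\b", re.IGNORECASE)


def is_web_sample(sample: dict) -> bool:
    """Return True if the instruction or output mentions any web keyword."""
    text = f"{sample.get('instruction', '')} {sample.get('output', '')}"
    return _WEB_PATTERN.search(text) is not None


samples = [
    {"instruction": "Write a CSS rule that centers a div.", "output": "div { margin: 0 auto; }"},
    {"instruction": "Sort a list of integers in Python.", "output": "sorted(nums)"},
]

# Keep only the samples that look web-related.
web_only = [s for s in samples if is_web_sample(s)]
print(len(web_only))
```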
Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the base model in bf16 and let it be placed on the available devices.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct")

# Attach the LoRA adapter on top of the base weights.
model = PeftModel.from_pretrained(base, "lhordking/webcoder-7b")

prompt = "Create a responsive dark mode landing page for a SaaS product"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
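The decoded text from `generate` contains the prompt followed by the completion, and any code may be mixed with commentary. If the model wraps code in markdown fences (an assumption; the card does not describe the output format), a small helper like the hypothetical `extract_code_blocks` below can pull the code out for saving to a file.

```python
import re

FENCE = "`" * 3  # markdown code-fence marker, built this way to keep the example readable

# Matches a fence, an optional language tag, the body, and the closing fence.
_BLOCK_RE = re.compile(FENCE + r"(\w*)[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code_blocks(generated: str) -> list[tuple[str, str]]:
    """Return (language, code) pairs for every fenced block in the generated text."""
    return [(lang or "text", body.strip()) for lang, body in _BLOCK_RE.findall(generated)]


# Stand-in for decoded model output; real output will differ.
sample = "Here is the page:\n" + FENCE + "html\n<h1>Hi</h1>\n" + FENCE + "\nDone."
blocks = extract_code_blocks(sample)
print(blocks)
```

When the output contains no fences, the helper returns an empty list, so callers can fall back to using the raw text.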
Example Prompts
- "Create a responsive navbar with dark mode toggle"
- "Build a SaaS landing page with hero section and pricing table"
- "Make a login form with email and password validation"
- "Create a portfolio page with project cards and animations"
Limitations
- Best results with HTML/CSS/JS prompts
- Output quality improves with specific, detailed prompts
- Complex full-stack applications may require further fine-tuning on additional data
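Since specific prompts tend to give better results, it can help to assemble them from a structured description instead of writing them ad hoc. The `build_prompt` helper below is a hypothetical sketch of that idea; its wording and default stack are illustrative choices, not a format the model was trained on.

```python
def build_prompt(
    component: str,
    requirements: list[str],
    stack: str = "HTML, CSS, and vanilla JavaScript",
) -> str:
    """Combine a component description and explicit requirements into one prompt."""
    lines = [f"Create {component} using {stack}.", "Requirements:"]
    lines += [f"- {req}" for req in requirements]
    return "\n".join(lines)


prompt = build_prompt(
    "a responsive navbar",
    [
        "dark mode toggle that persists across reloads",
        "collapses into a hamburger menu below 768px",
    ],
)
print(prompt)
```

The resulting string can be passed as `prompt` in the usage snippet in place of a one-line request.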
Model provider
lhordking
Model tree
Base
Qwen/Qwen2.5-Coder-7B-Instruct
Adapter
this model
Modalities
Input
Text
Output
Text