License: apache-2.0

How to use
This is a LoRA adapter, not a standalone model. Load the base model first, then apply the adapter:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_model_id = "Qwen/Qwen2.5-0.5B"
adapter_id = "s-m-sharjeel/qwen2.5-0.5b-alpaca-sft-lora"

# Load the tokenizer from the adapter repo and the base model in fp16.
tokenizer = AutoTokenizer.from_pretrained(adapter_id)
base = AutoModelForCausalLM.from_pretrained(
    base_model_id, torch_dtype=torch.float16, device_map="auto"
)

# Attach the LoRA adapter weights to the base model.
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

prompt = "### Instruction:\nWrite a short definition of machine learning.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding; the EOS token doubles as the padding token.
out = model.generate(
    **inputs,
    max_new_tokens=150,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id,
)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
Prompt format
The model was trained on the Alpaca-style template. For best results, format prompts as:
```markdown
### Instruction:
{your instruction here}

### Response:
```
If the task includes additional input/context, use:
```markdown
### Instruction:
{your instruction here}

### Input:
{context here}

### Response:
```
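
The two templates above can be produced by a small helper. This is a minimal sketch, not part of the released code: the function name `build_alpaca_prompt` is hypothetical, and the exact whitespace (a blank line between sections and a trailing newline after `### Response:`) is inferred from the prompt string in the usage example, so it is an assumption for the `### Input:` variant.

```python
from typing import Optional


def build_alpaca_prompt(instruction: str, context: Optional[str] = None) -> str:
    """Format an instruction (and optional context) in the Alpaca-style template.

    Hypothetical helper; spacing is assumed from the usage example's prompt string.
    """
    instruction = instruction.strip()
    if context and context.strip():
        # Variant with additional input/context.
        return (
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{context.strip()}\n\n"
            "### Response:\n"
        )
    # Instruction-only variant.
    return f"### Instruction:\n{instruction}\n\n### Response:\n"


if __name__ == "__main__":
    print(build_alpaca_prompt("Write a short definition of machine learning."))
```

The returned string can be passed directly to the tokenizer in place of a hand-written `prompt`.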
Training details
| Setting | Value |
|---|---|
| Base model | Qwen/Qwen2.5-0.5B |
| Method | Supervised Fine-Tuning (SFT) with LoRA |
| Dataset | yahma/alpaca-cleaned (5,000-sample subset, seed=42) |
| Train / Validation split | 4,500 / 500 (90/10) |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| Target modules | q_proj, v_proj, k_proj |
| LoRA dropout | 0.05 |
| Learning rate | 1e-4 |
| LR scheduler | cosine (warmup ratio 0.05) |
| Epochs | 1 |
| Batch size | 4 (gradient accumulation 4 → effective 16) |
| Max sequence length | 512 |
| Precision | fp16 |
| Platform | Kaggle (NVIDIA T4 GPU) |
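
A few derived quantities follow from the table and may help when reproducing the run. The sketch below is plain arithmetic, not the training script. It assumes a single GPU, that a trailing partial gradient-accumulation window is dropped (some trainer versions round up instead), and that warmup steps are rounded up from the warmup ratio, so the step counts are approximate.

```python
import math

# Values copied from the training-details table.
train_samples = 4500
per_device_batch = 4
grad_accum_steps = 4
epochs = 1
warmup_ratio = 0.05
lora_r = 16
lora_alpha = 32

# Assumption: one GPU, so effective batch = per-device batch * accumulation steps.
effective_batch = per_device_batch * grad_accum_steps

# Micro-batches per epoch, then optimizer updates per epoch.
# Assumption: a leftover partial accumulation window is dropped.
micro_batches = math.ceil(train_samples / per_device_batch)
total_updates = (micro_batches // grad_accum_steps) * epochs

# Assumption: warmup steps are rounded up from ratio * total updates.
warmup_steps = math.ceil(total_updates * warmup_ratio)

# Standard LoRA scaling factor applied to the low-rank update: alpha / r.
lora_scaling = lora_alpha / lora_r

print(f"effective batch size: {effective_batch}")
print(f"optimizer updates:    {total_updates}")
print(f"warmup steps:         {warmup_steps}")
print(f"LoRA scaling:         {lora_scaling}")
```

With these settings the adapter update is scaled by 2.0, and one epoch amounts to roughly 280 optimizer updates.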
Evaluation
Evaluated on a held-out set of 10 manually written instruction prompts with reference answers, using BLEU (sacreBLEU) and BERTScore F1. This configuration was selected as the best of 5 Alpaca trials by combined BLEU + BERTScore (validation loss as tie-breaker).
| Model | Mean BLEU | Mean BERTScore F1 |
|---|---|---|
| Base Qwen2.5-0.5B | 6.57 | 0.8854 |
| This model (Alpaca SFT, Trial 3) | 7.27 | 0.8864 |
This is a +10.7% relative improvement in BLEU over the base model (6.57 to 7.27); BERTScore F1 is nearly unchanged (0.8854 to 0.8864).
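
The relative change can be recomputed from the table values. The snippet is a small worked calculation using only the four reported means; the helper name `relative_change_pct` is hypothetical.

```python
def relative_change_pct(baseline: float, new: float) -> float:
    """Relative change from baseline to new, in percent."""
    return (new - baseline) / baseline * 100.0


# Means reported in the evaluation table.
bleu_gain = relative_change_pct(6.57, 7.27)
bert_gain = relative_change_pct(0.8854, 0.8864)

print(f"BLEU:         {bleu_gain:+.1f}%")
print(f"BERTScore F1: {bert_gain:+.2f}%")
```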
Frameworks
- PEFT
- TRL (SFTTrainer)
- Transformers
Authors
Developed as a course assignment for NLP with Deep Learning, Institute of Business Administration (IBA), Karachi.
| Name | ERP |
|---|---|
| Shazain | 27115 |
| Shayan | 26289 |
| Sharjeel | 26932 |
License
Released under the Apache 2.0 license, matching the base model Qwen2.5-0.5B.