kushalicious
research-slm-360m-lora
Dedicated Endpoints
Run this model inference on single tenant GPU with unmatched speed and reliability at scale.
Container
Run this model inference with full control and performance in your environment.
Get help setting up a custom Dedicated Endpoints.
Talk with our engineer to get a quote for reserved GPU instances with discounts.
README
License: apache-2.0Training
| Parameter | Value |
|---|---|
| Base model | SmolLM2-360M-Instruct |
| Method | LoRA (r=16, alpha=32) via Unsloth |
| Data | research-slm-dataset — 15k train / 500 eval |
| Hardware | Google Colab free T4 |
| Steps | 250 (3k examples subsampled) |
Evaluation (rule-based, 30 examples)
| Model | Overall |
|---|---|
| Base | 66.1% |
| This adapter | 67.8% |
Usage
python
from peft import PeftModelfrom transformers import AutoModelForCausalLM, AutoTokenizerimport torchbase = "HuggingFaceTB/SmolLM2-360M-Instruct"adapter = "kushalicious/research-slm-360m-lora"tokenizer = AutoTokenizer.from_pretrained(base)model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.float16, device_map="auto")model = PeftModel.from_pretrained(model, adapter)
Or use the full research loop from the GitHub repo:
bash
huggingface-cli download kushalicious/research-slm-360m-lora --local-dir lora_adapterpython -m runtime.main "Your research question" --adapter lora_adapter
GitHub
Full code, eval scripts, and Colab notebook: github.com/kushalicious/research-slm
Model provider
kushalicious
Model tree
Base
HuggingFaceTB/SmolLM2-360M-Instruct
Adapter
this model
Modalities
Input
Text
Output
Text
Pricing
Dedicated Endpoints
View detailsSupported Functionality
Model APIs
Dedicated Endpoints
Container
More information