NeuroBait
License: other
Build Small Hackathon Submission
NeuroBait is submitted for the Build Small Hackathon.
- Primary track: Backyard AI
- Why this track: NeuroBait focuses on a specific, real, everyday problem - ADHD task initiation - and turns a small model into a practical companion for that moment.
- Bonus quest fit: Well-Tuned, because this repo publishes the fine-tuned LoRA adapter used by the Space.
- Bonus quest fit: Off-Brand, because the app uses custom Gradio UI and anti-shame product copy rather than a default chatbot shell.
- Sponsor fit: Modal-powered, because fine-tuning and generation evaluation were run on Modal GPU infrastructure.
The project follows the hackathon shape: fine-tune a small-enough open model, publish the model on Hugging Face, and deploy a working Gradio app as a Hugging Face Space.
Intended Behavior
NeuroBait should:
- respond in short, natural prose,
- avoid visible labels such as `Micro-action`, `Hook`, or `Stakes`,
- avoid guilt framing and productivity shame,
- preserve user agency,
- ask one light question when context is too sparse,
- offer one tiny concrete action when enough context exists.
It should not act as a medical device, diagnostic tool, therapist, emergency support system, or replacement for professional care.
Training Data
Run #4 used a bilingual Indonesian/English conversational dataset:
- 270 train conversations
- 30 eval conversations
- multi-turn `messages[]` format
- official NeuroBait system prompt prepended to each example
The dataset is intentionally not included in this model repo.
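Since the dataset itself is not published, the following is only a minimal sketch of what one record in the multi-turn `messages[]` format might look like with the system prompt prepended. The prompt text, the example turns, the helper name `with_system_prompt`, and the JSONL storage are all assumptions made for illustration; none of them come from the actual dataset.

```python
import json

# Placeholder: the real NeuroBait system prompt is not reproduced in this card.
SYSTEM_PROMPT = "<official NeuroBait system prompt goes here>"


def with_system_prompt(turns, system_prompt=SYSTEM_PROMPT):
    """Return one training record with the system prompt prepended to the turns."""
    if turns and turns[0]["role"] == "system":
        raise ValueError("conversation already starts with a system message")
    return {"messages": [{"role": "system", "content": system_prompt}, *turns]}


# Invented example turns (hypothetical, not taken from the dataset).
example_turns = [
    {"role": "user", "content": "I need to start my report but I keep scrolling."},
    {"role": "assistant", "content": "What is the report about?"},
    {"role": "user", "content": "Quarterly sales numbers."},
    {"role": "assistant", "content": "Open the file and type only the title. That is all for now."},
]

record = with_system_prompt(example_turns)

# Assumption: one conversation per line (JSONL); ensure_ascii=False keeps
# non-ASCII characters readable in a bilingual dataset.
jsonl_line = json.dumps(record, ensure_ascii=False)
```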
Training Configuration
- Base: `unsloth/gemma-3-12b-it` (dense Gemma 3 12B)
- Method: 16-bit LoRA, not QLoRA, via Unsloth
- LoRA rank: 16
- LoRA alpha: 16
- LoRA dropout: 0
- Target modules: q/k/v/o/gate/up/down projections
- Epochs: 3
- Learning rate: 2e-4
- Effective batch size: 8
- Max sequence length: 2048
- Scheduler: cosine
- Warmup ratio: 0.05
- Optimizer: AdamW 8-bit
- Precision: bf16
- Chat template: `gemma-3`
- Response-only markers: `<start_of_turn>user\n` / `<start_of_turn>model\n`
- Checkpoints: `save_strategy="no"` to avoid the known Unsloth/TRL checkpoint pickle issue
Training ran on Modal with an H100 80GB GPU.
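The response-only markers in the configuration suggest that loss is computed only on model turns. The following is a string-level sketch of that idea, not Unsloth's implementation: trainers usually work on token IDs and mask the ignored positions in the labels rather than slicing text. The function name `response_spans` and the sample conversation are hypothetical.

```python
INSTRUCTION_PART = "<start_of_turn>user\n"
RESPONSE_PART = "<start_of_turn>model\n"


def response_spans(text):
    """Return (start, end) character spans that follow a response marker.

    Each span runs from just after RESPONSE_PART to the next INSTRUCTION_PART,
    or to the end of the text if no further user turn follows.
    """
    spans = []
    pos = 0
    while True:
        start = text.find(RESPONSE_PART, pos)
        if start == -1:
            break
        start += len(RESPONSE_PART)
        end = text.find(INSTRUCTION_PART, start)
        if end == -1:
            end = len(text)
        spans.append((start, end))
        pos = end
    return spans


# Invented conversation rendered with Gemma-style turn markers.
rendered = (
    "<start_of_turn>user\nI can't start my laundry.<end_of_turn>\n"
    "<start_of_turn>model\nWhere is the basket right now?<end_of_turn>\n"
    "<start_of_turn>user\nNext to my bed.<end_of_turn>\n"
    "<start_of_turn>model\nCarry it to the machine. Just that.<end_of_turn>\n"
)

# Only these pieces would contribute to the loss in this simplified view.
trained_text = [rendered[s:e] for s, e in response_spans(rendered)]
```

Everything outside the returned spans (user turns and the markers themselves) would be excluded from the loss, so the adapter learns the reply style rather than how to imitate user messages.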
Deployment
The deployed Space runs on Hugging Face ZeroGPU.
Runtime path:
- Gradio Space
- `transformers` + `peft`
- 4-bit bitsandbytes NF4 loading
- base model: `unsloth/gemma-3-12b-it`
- LoRA adapter: `build-small-hackathon/NeuroBait`
Unsloth is used for training, not for Space inference. The dense Gemma 3 12B base
was chosen because it deploys cleanly through the standard
transformers + peft path on ZeroGPU.
Run #4 Results
Training completed 102 steps.
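The 102-step figure is consistent with the listed configuration. A quick check, assuming that a final partial batch in each epoch still counts as one optimizer step (the exact per-device batch size and gradient accumulation split are not stated here):

```python
import math

train_conversations = 270
effective_batch_size = 8
epochs = 3

# Assumption: the last, smaller batch of each epoch still triggers a step.
steps_per_epoch = math.ceil(train_conversations / effective_batch_size)
total_steps = steps_per_epoch * epochs

print(steps_per_epoch, total_steps)  # 34 102
```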
Training summary:
- train conversations: 270
- eval conversations: 30
- train loss: 1.7501
- eval loss: 1.8844
The loss signal should be treated as a weak training diagnostic for this project. NeuroBait is primarily evaluated through generated behavior against the base model.
Generation eval summary over 8 held-out or novel prompts:
- base persona average: 2.25 / 4
- fine-tuned persona average: 4.0 / 4
- base average words: 80.4
- fine-tuned average words: 55.1
- base label leaks: 5
- fine-tuned label leaks: 0
- base action-cue responses: 5
- fine-tuned action-cue responses: 4
Qualitatively, the fine-tuned adapter produced shorter, more conversational responses and did not leak literal structure labels, while the base model leaked labels in 5 of 8 prompts.
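Metrics such as label leaks and average word count can be computed with simple string checks. The sketch below is an assumption about how such a check could look, not the scoring script used for Run #4: the matching rule (a label at the start of a line followed by a colon), the helper names, and the sample responses are all hypothetical.

```python
import re

# Labels named under "Intended Behavior". The matching rule is an assumption:
# a label at the start of a line, optionally wrapped in markdown, then a colon.
LABELS = ("Micro-action", "Hook", "Stakes")
LABEL_PATTERN = re.compile(
    r"^\W*(" + "|".join(re.escape(label) for label in LABELS) + r")\W*:",
    re.IGNORECASE | re.MULTILINE,
)


def has_label_leak(response):
    """True if the response exposes any structure label as a visible heading."""
    return LABEL_PATTERN.search(response) is not None


def word_count(response):
    """Whitespace-delimited word count."""
    return len(response.split())


def summarize(responses):
    """Aggregate leak count and average length over a list of responses."""
    return {
        "label_leaks": sum(has_label_leak(r) for r in responses),
        "avg_words": sum(word_count(r) for r in responses) / len(responses),
    }


# Invented responses for illustration only.
samples = [
    "**Hook:** Imagine the relief.\nMicro-action: open the doc.",
    "Open the doc and type the title. That's it.",
]
summary = summarize(samples)
```

A response counts as one leak regardless of how many labels it contains, which matches the per-prompt framing of the summary ("5 of 8 prompts").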
Loading
Example adapter loading path:
```python
from peft import PeftModel
from transformers import AutoModelForImageTextToText, AutoTokenizer, BitsAndBytesConfig
import torch

base_model = "unsloth/gemma-3-12b-it"
adapter_id = "build-small-hackathon/NeuroBait"

# The tokenizer is loaded from the adapter repo.
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

# Load the base model in 4-bit NF4 with bf16 compute.
model = AutoModelForImageTextToText.from_pretrained(
    base_model,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
        bnb_4bit_quant_type="nf4",
    ),
    device_map="auto",
)

# Attach the LoRA adapter and switch to inference mode.
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()
```
Limitations
- The model is not a medical or crisis-support system.
- It may still ask a clarifying question when the user expected a direct nudge.
- It may be too brief, too playful, or too gentle for some contexts.
- Long-term personalization should be handled by product logic, not only by the model.
- App-initiated reminders should be handled by the app layer, not generated spontaneously by the model.
Source
Public source repo:
```text
https://github.com/Subrata15/NeuroBait-Build-Small-Model
```
Codex trace dataset:
```text
https://huggingface.co/datasets/build-small-hackathon/NeuroBait-Codex-Traces
```