README
Intended Use
- Structured reflective journaling
- Gentle CBT-informed self-reflection
- Identifying emotions, affected life areas, and common cognitive distortions
- Producing a balanced reframe, tiny next step, and reflective question
- Brief follow-up coaching when the user responds with self-critical thoughts
Output Format
For journal analysis, the adapter is trained to produce exactly six sections:
=== EMOTIONS ===
=== LIFE AREAS ===
=== COGNITIVE DISTORTIONS ===
=== BALANCED REFRAME ===
=== TINY NEXT STEP ===
=== REFLECTION ===
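A downstream app will usually want these sections as structured data. The following is a minimal sketch of one way to split a journal analysis into its six parts, assuming each header sits on its own line in the order listed above. The function name `parse_journal_analysis` is hypothetical and not part of this model's tooling.

```python
import re

# Section names in the order the adapter is trained to emit them.
SECTION_NAMES = [
    "EMOTIONS",
    "LIFE AREAS",
    "COGNITIVE DISTORTIONS",
    "BALANCED REFRAME",
    "TINY NEXT STEP",
    "REFLECTION",
]

# Matches a header such as "=== EMOTIONS ===" on its own line.
HEADER_RE = re.compile(r"^===[ \t]*(.+?)[ \t]*===[ \t]*$", re.MULTILINE)


def parse_journal_analysis(text: str) -> dict[str, str]:
    """Split model output into {section name: body text}.

    Raises ValueError if the headers are missing, extra, or out of order.
    """
    matches = list(HEADER_RE.finditer(text))
    found = [m.group(1) for m in matches]
    if found != SECTION_NAMES:
        raise ValueError(f"expected sections {SECTION_NAMES}, got {found}")

    sections = {}
    for i, m in enumerate(matches):
        # A section body runs until the next header, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections[m.group(1)] = text[m.end():end].strip()
    return sections
```

Rejecting malformed output, rather than returning a partial result, lets the caller decide whether to retry generation or show a fallback message.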
For follow-up chat, the adapter is trained to stay brief, avoid hidden reasoning tags, avoid business-style skill metrics, validate the feeling, separate feeling from evidence, and ask one grounded question.
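Some of these follow-up constraints can be checked mechanically. The sketch below covers only three of them: no reasoning tags, brevity, and exactly one question. The tag names, the 80-word limit, and the function name `check_follow_up` are illustrative assumptions, not values from the training setup. Whether a reply validates the feeling or separates feeling from evidence still needs human or model-based review.

```python
import re

# Assumption: reasoning tags look like <think>...</think> or similar.
REASONING_TAG_RE = re.compile(
    r"</?(think|thinking|reasoning)\b[^>]*>", re.IGNORECASE
)

# Assumption: "brief" is taken to mean at most 80 words.
MAX_WORDS = 80


def check_follow_up(reply: str) -> list[str]:
    """Return a list of detected problems; an empty list means none found."""
    problems = []
    if REASONING_TAG_RE.search(reply):
        problems.append("contains reasoning tags")
    if len(reply.split()) > MAX_WORDS:
        problems.append("too long")
    # Counting question marks is a rough proxy for "asks one question".
    if reply.count("?") != 1:
        problems.append("expected exactly one question")
    return problems
```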
Training Recipe
- Base model: openbmb/MiniCPM5-1B-SFT
- Method: QLoRA with 4-bit NF4 quantization
- Adapter: rank 16 LoRA on attention projections
- Hardware: Modal NVIDIA A10G
- Training set: 30 structured journal examples and 15 multi-turn coach examples
- Sequence length: 1536 tokens
- App runtime: Hugging Face Space with local model execution only
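The recipe above could be expressed with `transformers` and `peft` roughly as follows. This is a sketch under stated assumptions, not the training script used for this adapter: the recipe gives the base model, 4-bit NF4 quantization, rank 16, and the 1536-token sequence length, while the projection module names, `lora_alpha`, dropout, and compute dtype below are assumed values.

```python
# Values taken from the recipe.
BASE_MODEL = "openbmb/MiniCPM5-1B-SFT"
MAX_SEQ_LEN = 1536
LORA_RANK = 16

# Assumptions, not stated in the recipe: verify the module names against
# the base model's architecture before use.
ASSUMED_TARGET_MODULES = ["q_proj", "k_proj", "v_proj", "o_proj"]
ASSUMED_LORA_ALPHA = 32
ASSUMED_LORA_DROPOUT = 0.05


def build_configs():
    """Return (quantization config, LoRA config) for a QLoRA run.

    Imports are deferred so the constants above can be inspected
    without a GPU software stack installed.
    """
    import torch
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumption
    )
    lora_config = LoraConfig(
        r=LORA_RANK,
        lora_alpha=ASSUMED_LORA_ALPHA,
        lora_dropout=ASSUMED_LORA_DROPOUT,
        target_modules=ASSUMED_TARGET_MODULES,
        task_type="CAUSAL_LM",
    )
    return quant_config, lora_config
```

The quantization config would be passed to the base model loader as `quantization_config`, and the LoRA config to `peft.get_peft_model`.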
Safety Notes
The model should respond with supportive reflection, not certainty. It should not diagnose the user, prescribe treatment, provide crisis intervention, or claim to know whether the user's thoughts are objectively true. For immediate danger or crisis situations, users should contact local emergency services or a crisis hotline.
Model provider: build-small-hackathon
Model tree: base openbmb/MiniCPM5-1B-SFT, adapter this model
Modalities: text input, text output