Run this model inference on single tenant GPU with unmatched speed and reliability at scale.
Run this model inference with full control and performance in your environment.
Get help setting up a custom Dedicated Endpoints.
Talk with our engineer to get a quote for reserved GPU instances with discounts.
README
License: apache-2.0About the project
The Tragedy of the Group Chat was created for the Hugging Face Build-Small Hackathon (June 2026).
The project explored how much personality and structure can be taught to a relatively small local model through careful fine-tuning and iterative evaluation.
Rather than building a general-purpose assistant, the goal was to transform tiny modern inconveniences into exaggerated theatrical comedy scenes.
Repository structure
The project consists of three related repositories:
- LoRA adapter: The original PEFT fine-tuning.
- Merged model (this repository): A standalone Transformers version of the model.
- GGUF edition: A quantised deployment build for llama.cpp and local inference.
What does it do?
Given a small modern grievance, the model produces a short comic scene in the style of a badly organised Elizabethan theatre company.
Typical outputs include:
- TITLE
- DRAMATIS PERSONAE
- SCENE
- THOU MUST CHOOSE
The intended voice combines influences from:
- Shakespeare
- Blackadder
- Monty Python
- British sitcoms
- Amateur dramatic societies
Performance
The underlying fine-tuned model achieved 56/80 on a held-out ten-prompt manual benchmark. The base Qwen2.5-3B-Instruct model scored 36/80 using the same evaluation procedure.
The merged model preserves the behaviour of the LoRA adapter while simplifying deployment.
Intended use
The model is intended for entertainment and creative text generation.
It performs best on:
- everyday annoyances
- social awkwardness
- household disasters
- transport failures
- office politics
- mildly haunted appliances
- inexplicably judgemental animals
Limitations
The model intentionally prioritises style over factual accuracy.
Recurring characters and running jokes are expected behaviour.
The model performs best on small frustrations rather than major life events.
Loading
This repository contains a standalone Transformers model and can be loaded directly with the Hugging Face Transformers library.
Build process
This model was created by:
Qwen2.5-3B-Instruct
↓
LoRA fine-tuning
↓
PEFT merge-and-unload
↓
Standalone Transformers model
This merged model serves as the source for the GGUF deployment build.
Try the model
Related repositories
Model provider
sdavies
Model tree
Base
Qwen/Qwen2.5-3B-Instruct
Fine-tuned
this model
Modalities
Input
Text
Output
Text
Pricing
Dedicated Endpoints
View detailsSupported Functionality
Model APIs
Dedicated Endpoints
Container
More information