swohamkayastha
legacyscribe-9b-v3-lora
README
License: apache-2.0
Five-agent pipeline
- Questioner: asks warm follow-up questions to draw out more of the story
- Arc classifier: tags the narrative stage (setup / tension / turn / meaning)
- Extractors: pull person, place, time, and emotion as structured JSON
- Reconciler: detects contradictions across memory notes
- Publisher: synthesizes notes into a warm first-person narrative passage
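As a rough illustration of how the Arc classifier and Extractors outputs listed above might be combined, here is a minimal Python sketch. The names `ArcStage`, `MemoryNote`, and `parse_note` are hypothetical, the flat JSON shape is an assumption (the card only says the fields are returned as structured JSON), and the sample values are invented for the example.

```python
import json
from dataclasses import dataclass
from enum import Enum


class ArcStage(Enum):
    """The four narrative stages named in the pipeline list."""
    SETUP = "setup"
    TENSION = "tension"
    TURN = "turn"
    MEANING = "meaning"


# Fields the Extractors are described as pulling out.
REQUIRED_FIELDS = {"person", "place", "time", "emotion"}


@dataclass
class MemoryNote:
    """Hypothetical container for one extraction result plus its arc tag."""
    person: str
    place: str
    time: str
    emotion: str
    stage: ArcStage


def parse_note(extractor_json: str, stage_label: str) -> MemoryNote:
    """Parse extractor output (assumed to be a flat JSON object) and attach a stage."""
    data = json.loads(extractor_json)
    missing = REQUIRED_FIELDS - set(data)
    if missing:
        raise ValueError(f"extractor output is missing fields: {sorted(missing)}")
    return MemoryNote(
        person=data["person"],
        place=data["place"],
        time=data["time"],
        emotion=data["emotion"],
        # Normalize the label so "Tension" and " tension " both map to the enum.
        stage=ArcStage(stage_label.strip().lower()),
    )


# Invented sample input, for illustration only.
note = parse_note(
    '{"person": "grandmother", "place": "Bhaktapur", "time": "1984", "emotion": "pride"}',
    "Tension",
)
print(note.stage.value, note.person)
```

Validating the fields before building the note means a malformed model response fails early, before it reaches the Reconciler or Publisher steps.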
Usage (GGUF / Ollama)
```bash
ollama run hf.co/your-username/legacyscribe-9b-v3-gguf
```
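Once the model has been pulled with the command above, it can also be called programmatically. The sketch below assumes an Ollama server on its default local address and uses its `/api/generate` endpoint with streaming disabled. The helper names `build_request` and `generate` are hypothetical, and the model tag is copied from the command as-is, placeholder username included.

```python
import json
import urllib.request

# Assumption: default local Ollama address; change it if your server differs.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Model tag as written in the usage command (placeholder username kept).
MODEL = "hf.co/your-username/legacyscribe-9b-v3-gguf"


def build_request(prompt: str, model: str = MODEL) -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str) -> str:
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `generate("I remember the monsoon the year my brother was born.")` would return the model's reply as a string.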
Training
- Base: Qwen3.5-9B (4-bit, Unsloth + LoRA)
- LoRA r=16, alpha=32, 5 epochs
- Dataset: custom Nepali/South Asian memory examples
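To put the LoRA settings above in context, the following sketch works through the arithmetic, assuming the standard LoRA formulation in which the low-rank update is scaled by alpha / r. The 4096 x 4096 layer shape is a hypothetical example and not a dimension taken from the base model.

```python
def lora_scaling(r: int, alpha: int) -> float:
    """Scaling applied to the low-rank update in standard LoRA: alpha / r."""
    return alpha / r


def lora_param_count(d_out: int, d_in: int, r: int) -> int:
    """Trainable parameters added to one d_out x d_in weight matrix.

    LoRA adds two factors, A (r x d_in) and B (d_out x r).
    """
    return r * (d_in + d_out)


R, ALPHA = 16, 32  # values from the training list

print(lora_scaling(R, ALPHA))           # 2.0
# Hypothetical 4096 x 4096 projection, for illustration only.
print(lora_param_count(4096, 4096, R))  # 131072
```

For comparison, the full 4096 x 4096 matrix in this hypothetical has 16,777,216 weights, so the adapter trains under 1% as many parameters for that layer.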
Model provider
- swohamkayastha
Model tree
- Base: unsloth/Qwen3.5-9B
- Adapter: this model
Modalities
- Input: Video, Text, Image
- Output: Text