flywheel-ai

legal-intake

Deploy Dedicated

Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

README

License: apache-2.0

Download (one command)

bash
pip install -U huggingface_hub
hf download flywheel-ai/legal-intake                      # full repo (safetensors + GGUF)
hf download flywheel-ai/legal-intake model-q4_k_m.gguf    # just the GGUF

Run

bash
# llama.cpp
llama-server -m model-q4_k_m.gguf -ngl 999
# Ollama (pulls the GGUF straight from HF)
ollama run hf.co/flywheel-ai/legal-intake
# vLLM (serves the safetensors)
vllm serve flywheel-ai/legal-intake

Provenance & honesty

v1.0 is trained on synthetic seed data authored by permissively-licensed local models (Apache/MIT teachers only — never distilled from closed models). On general prompts it is roughly on par with the base; the niche edge sharpens as consented real usage flows through the OpSpot flywheel. Built on Qwen3.6 (Apache-2.0).

Model provider

flywheel-ai

Model tree

Base

Qwen/Qwen3.6-35B-A3B

Quantized

this model

Modalities

Input

Video, Text, Image

Output

Text