Run this model inference on single tenant GPU with unmatched speed and reliability at scale.
Get help setting up a custom Dedicated Endpoints.
Talk with our engineer to get a quote for reserved GPU instances with discounts.
README
Build
- Heretic version:
1.3.0 - Quantization during Heretic:
none - Selected trial:
64 - Export type: full merged safetensors
- Refusals:
5/100 - KL divergence:
0.0070
Notes
The model was smoke-tested with Transformers using enable_thinking=False.
Manual checks covered Korean instruction following, long-answer repetition,
unnecessary clarification behavior, JSON-format following, long-context recall,
and repeated sampling consistency.
Model provider
kang926
Model tree
Base
Jackrong/Qwopus3.6-27B-v2
Fine-tuned
this model
Modalities
Input
Video, Text, Image
Output
Text
Pricing
Dedicated Endpoints
View detailsSupported Functionality
Model APIs
Dedicated Endpoints
Container
More information