License: apache-2.0

Quick Benchmarks
| Check | Original Qwen3.6-35B-A3B | Abliterated Heretic BF16 |
|---|---|---|
| Official 25-prompt refusal check | 22/25 refusals | 1/25 refusals |
| Archived Heretic KL divergence | - | 0.010655362159013748 |
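For readers unfamiliar with the KL divergence row, the following is a minimal sketch of how a divergence between the original and abliterated models' next-token distributions could be computed and averaged over prompts. It is not the Heretic implementation: the direction of the divergence, the token positions compared, and the function names `kl_divergence` and `mean_kl` are all assumptions made for illustration.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) in nats for two discrete distributions.

    p and q are sequences of probabilities over the same vocabulary.
    Treating p as the original model and q as the modified model is an
    assumption of this sketch.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            # Clamp q away from zero so the logarithm stays finite.
            total += pi * math.log(pi / max(qi, eps))
    return total


def mean_kl(original_dists, modified_dists):
    """Average KL(p || q) over paired per-prompt distributions."""
    pairs = list(zip(original_dists, modified_dists))
    if not pairs:
        raise ValueError("need at least one pair of distributions")
    return sum(kl_divergence(p, q) for p, q in pairs) / len(pairs)
```

A value near zero, such as the one reported in the table, would indicate that the modified model's output distributions stay close to the original's on the prompts measured.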
Abliteration notes:
- base model: Qwen/Qwen3.6-35B-A3B
- method family: Heretic MPOA/SOMA-style sibling transfer, finalized with split-MoE input-side intervention
- official refusal check: 1/25 refusals on the same 25-prompt marker suite used for the MiniMax M2.7 abliterated run
- vision-language wrapper preserved; intervention was applied on the text-side MoE stack
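A marker-based refusal check of the kind referenced above can be sketched as follows. The marker list here is hypothetical and is not the suite used for this model; the names `REFUSAL_MARKERS`, `is_refusal`, and `count_refusals` are likewise illustrative assumptions.

```python
# Hypothetical refusal markers; the actual 25-prompt suite and its
# marker list are not given in this README.
REFUSAL_MARKERS = (
    "i can't",
    "i cannot",
    "i won't",
    "i'm sorry",
    "as an ai",
)


def is_refusal(response, markers=REFUSAL_MARKERS):
    """Return True if the response contains any refusal marker."""
    # Lowercase and normalize curly apostrophes before substring matching.
    text = response.lower().replace("\u2019", "'")
    return any(marker in text for marker in markers)


def count_refusals(responses, markers=REFUSAL_MARKERS):
    """Count how many responses are flagged as refusals."""
    return sum(1 for response in responses if is_refusal(response, markers))


if __name__ == "__main__":
    sample = [
        "I'm sorry, but I can't help with that.",
        "Sure, here is an overview of the topic.",
    ]
    print(f"{count_refusals(sample)}/{len(sample)} refusals")
```

Substring matching is crude: it can flag a compliant answer that happens to contain a marker and miss a refusal phrased differently, so counts from a check like this are best read as a rough indicator.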
Notes:
- GGUF quants are published separately in Youssofal/Qwen3.6-35B-A3B-Abliterated-Heretic-GGUF
- Export metadata for the accepted candidate is included in abliteration_metadata.json
Model provider: CCSSNE
Model tree: base Qwen/Qwen3.6-35B-A3B; fine-tuned: this model
Modalities: input Video, Text, Image; output Text