Run this model inference on single tenant GPU with unmatched speed and reliability at scale.
Get help setting up a custom Dedicated Endpoints.
Talk with our engineer to get a quote for reserved GPU instances with discounts.
README
License: apache-2.0CoT-aware abliteration
Gemma 4 is a thinking model — its refusal decision forms inside the chain-of-thought, so a naive abliteration (evaluated without thinking) leaves the model refusing once thinking is enabled (the default in Ollama / llama.cpp). This release is abliterated and evaluated with thinking on, scoring the final answer after the thought block, so it decensors the model the way it's actually used.
Performance
| Metric | This model | Original |
|---|---|---|
| KL divergence | 0.055 | 0 (by definition) |
| Refusals, thinking on (extreme adversarial set) | ~21% | ~98% |
The ARA-LoRA method reaches this decensoring at far lower KL than single-direction abliteration (≈6× lower for comparable results), preserving more of the base model's capability. The model complies with the overwhelming majority of requests in normal use; only a small fraction of extreme prompts may still be refused. The thinking trace may still contain caveats even when the final answer complies.
GGUF quantizations: igorls/gemma-4-E4B-it-qat-q4_0-unquantized-heretic-GGUF.
Disclaimer
This is an abliterated model: its built-in safety guardrails and refusal behavior have been deliberately removed. As a result it will attempt to answer essentially any prompt and can produce content that is offensive, inaccurate, explicit, or otherwise harmful — content the original Gemma 4 would have refused.
- No safety alignment. Do not rely on it to refuse unsafe requests or to self-moderate. Apply your own filtering, guardrails, and human review before any production or user-facing use.
- You are solely responsible for how you use this model and for complying with all applicable laws and with the base model's Gemma 4 license (Apache 2.0).
- Intended for adults (18+), for research, evaluation, and lawful creative use where permitted.
- Provided as-is, without warranty of any kind. The author accepts no liability for any output produced or any use of this model.
Model provider
igorls
Model tree
Base
google/gemma-4-E4B-it-qat-q4_0-unquantized
Fine-tuned
this model
Modalities
Input
Text, Image
Output
Text
Pricing
Dedicated Endpoints
View detailsSupported Functionality
Model APIs
Dedicated Endpoints
Container
More information