CoT-aware abliteration
Gemma 4 is a thinking model — its refusal decision forms inside the
chain-of-thought, so a naive abliteration (evaluated without thinking) leaves the
model refusing once thinking is enabled (the default in Ollama / llama.cpp). This
release is abliterated and evaluated with thinking on, scoring the final answer
after the thought block, so it decensors the model the way it's actually used.
Table with columns: Metric, This model, Original| Metric | This model | Original |
|---|
| KL divergence | 0.055 | 0 (by definition) |
| Refusals, thinking on (extreme adversarial set) | ~21% | ~98% |
The ARA-LoRA method reaches this decensoring at far lower KL than single-direction
abliteration (≈6× lower for comparable results), preserving more of the base model's
capability. The model complies with the overwhelming majority of requests in normal
use; only a small fraction of extreme prompts may still be refused. The thinking
trace may still contain caveats even when the final answer complies.
GGUF quantizations:
igorls/gemma-4-E4B-it-qat-q4_0-unquantized-heretic-GGUF.
Disclaimer
This is an abliterated model: its built-in safety guardrails and refusal
behavior have been deliberately removed. As a result it will attempt to answer
essentially any prompt and can produce content that is offensive, inaccurate,
explicit, or otherwise harmful — content the original Gemma 4 would have refused.
- No safety alignment. Do not rely on it to refuse unsafe requests or to
self-moderate. Apply your own filtering, guardrails, and human review before
any production or user-facing use.
- You are solely responsible for how you use this model and for complying
with all applicable laws and with the base model's
Gemma 4 license (Apache 2.0).
- Intended for adults (18+), for research, evaluation, and lawful creative
use where permitted.
- Provided as-is, without warranty of any kind. The author accepts no
liability for any output produced or any use of this model.