Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

README

License: apache-2.0

CoT-aware abliteration

Gemma 4 is a thinking model — its refusal decision forms inside the chain-of-thought, so a naive abliteration (evaluated without thinking) leaves the model refusing once thinking is enabled (the default in Ollama / llama.cpp). This release is abliterated and evaluated with thinking on, scoring the final answer after the thought block, so it decensors the model the way it's actually used.

Performance

MetricThis modelOriginal
KL divergence0.0550 (by definition)
Refusals, thinking on (extreme adversarial set)~21%~98%

The ARA-LoRA method reaches this decensoring at far lower KL than single-direction abliteration (≈6× lower for comparable results), preserving more of the base model's capability. The model complies with the overwhelming majority of requests in normal use; only a small fraction of extreme prompts may still be refused. The thinking trace may still contain caveats even when the final answer complies.

GGUF quantizations: igorls/gemma-4-E4B-it-qat-q4_0-unquantized-heretic-GGUF.

Disclaimer

This is an abliterated model: its built-in safety guardrails and refusal behavior have been deliberately removed. As a result it will attempt to answer essentially any prompt and can produce content that is offensive, inaccurate, explicit, or otherwise harmful — content the original Gemma 4 would have refused.

  • No safety alignment. Do not rely on it to refuse unsafe requests or to self-moderate. Apply your own filtering, guardrails, and human review before any production or user-facing use.
  • You are solely responsible for how you use this model and for complying with all applicable laws and with the base model's Gemma 4 license (Apache 2.0).
  • Intended for adults (18+), for research, evaluation, and lawful creative use where permitted.
  • Provided as-is, without warranty of any kind. The author accepts no liability for any output produced or any use of this model.

Model provider

igorls

Model tree

Base

google/gemma-4-E4B-it-qat-q4_0-unquantized

Fine-tuned

this model

Modalities

Input

Text, Image

Output

Text

Pricing

Dedicated Endpoints

View details

Supported Functionality

Model APIs

Dedicated Endpoints

Container

More information

Explore FriendliAI today