liskasYR/gemma-4-12B-it-qat-q4_0-unquantized-heretic API & Inference Endpoint

v1.1 — thinking-mode fix

Gemma 4 is a thinking model: its refusal decision forms inside the chain-of-thought. The first release was abliterated and evaluated with thinking disabled, so it still refused once thinking was turned on, which is the default. v1.1 is re-tuned to decensor the model with thinking enabled, the way it is actually used.

Abliteration parameters

Parameter	Value
attn.o_proj.max_weight	1.48
attn.o_proj.max_weight_position	36.55
attn.o_proj.min_weight	1.29
attn.o_proj.min_weight_distance	21.18
mlp.down_proj.max_weight	1.48
mlp.down_proj.max_weight_position	31.62
mlp.down_proj.min_weight	1.43
mlp.down_proj.min_weight_distance	12.60
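
The parameter names above suggest a per-layer ablation weight that peaks at max_weight_position and falls off toward min_weight over min_weight_distance layers. The following is a minimal sketch of that reading, not Heretic's confirmed implementation: the linear interpolation, the zero weight outside the window, the function name ablation_weight, and the layer count of 48 are all assumptions made for illustration.

```python
def ablation_weight(layer, max_weight, max_weight_position,
                    min_weight, min_weight_distance):
    """Hypothetical per-layer ablation weight (assumed linear falloff)."""
    distance = abs(layer - max_weight_position)
    if distance > min_weight_distance:
        # Assumption: layers outside the window are left unmodified.
        return 0.0
    # Linear interpolation from max_weight (distance 0) to min_weight (window edge).
    return max_weight + (distance / min_weight_distance) * (min_weight - max_weight)


# Values copied from the attn.o_proj rows of the table above.
attn_o_proj = dict(
    max_weight=1.48,
    max_weight_position=36.55,
    min_weight=1.29,
    min_weight_distance=21.18,
)

NUM_LAYERS = 48  # illustrative only; not taken from the model card

profile = [ablation_weight(i, **attn_o_proj) for i in range(NUM_LAYERS)]

for i, w in enumerate(profile):
    print(f"layer {i:2d}: weight {w:.3f}")
```

Under these assumptions, layers near index 36 or 37 receive the strongest ablation for attn.o_proj, and layers more than about 21 positions away are untouched.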

Performance

Metric	This model (v1.1)	Original
KL divergence	0.32	0 (by definition)
Refusals, thinking on (adversarial harmful set)	~22%	~99%

KL is high relative to a typical abliteration; that is the cost of suppressing refusal through the reasoning trajectory of a thinking model. The model complies with the vast majority of requests in normal use, but extreme prompts may still be refused (about 22% of the adversarial harmful set above). A stronger v2 is in progress.
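
For readers unfamiliar with the KL figure, here is a minimal sketch of how a KL divergence between the original and the modified model's next-token distributions can be computed. It uses toy logits over a three-token vocabulary; the direction KL(original || modified), the helper names, and the logit values are assumptions for illustration, and the model card does not state which prompts or token positions were used for the reported 0.32.

```python
import math


def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two distributions over the same vocabulary."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


# Toy next-token logits (hypothetical values, not measured from either model).
original_logits = [2.0, 1.0, 0.1]
modified_logits = [1.0, 1.5, 0.3]

p = softmax(original_logits)
q = softmax(modified_logits)

kl_same = kl_divergence(p, p)  # a model compared with itself
kl_diff = kl_divergence(p, q)  # original compared with modified

print(f"KL(original || original) = {kl_same:.4f}")
print(f"KL(original || modified) = {kl_diff:.4f}")
```

Comparing a distribution with itself gives zero, which is why the Original column reads "0 (by definition)". Any positive value measures how far the modified model's predictions have drifted from the original's.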

This is the QAT-Q4_0 "unquantized" checkpoint of Gemma 4 12B (plain bf16 weights with QAT calibration baked in), decensored with Heretic. GGUF quantizations are available at igorls/gemma-4-12B-it-qat-q4_0-unquantized-heretic-GGUF.

Disclaimer

This is an abliterated model: its built-in safety guardrails and refusal behavior have been deliberately removed. As a result it will attempt to answer essentially any prompt and can produce content that is offensive, inaccurate, explicit, or otherwise harmful — content the original Gemma 4 would have refused.

No safety alignment. Do not rely on it to refuse unsafe requests or to self-moderate. Apply your own filtering, guardrails, and human review before any production or user-facing use.
You are solely responsible for how you use this model and for complying with all applicable laws and with the base model's Gemma 4 license (Apache 2.0).
Intended for adults (18+), for research, evaluation, and lawful creative use where permitted.
Provided as-is, without warranty of any kind. The author accepts no liability for any output produced or any use of this model.
