README

License: apache-2.0

v1.1 — thinking-mode fix

Gemma 4 is a thinking model: its refusal decision forms inside the chain-of-thought. The first release was abliterated and evaluated with thinking disabled, so it still refused once thinking was turned on (the default). v1.1 is re-tuned to decensor the model with thinking enabled, which is how it is actually used.

Abliteration parameters

| Parameter | Value |
| --- | --- |
| attn.o_proj.max_weight | 1.48 |
| attn.o_proj.max_weight_position | 36.55 |
| attn.o_proj.min_weight | 1.29 |
| attn.o_proj.min_weight_distance | 21.18 |
| mlp.down_proj.max_weight | 1.48 |
| mlp.down_proj.max_weight_position | 31.62 |
| mlp.down_proj.min_weight | 1.43 |
| mlp.down_proj.min_weight_distance | 12.60 |
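The card does not explain how these four values per component translate into a per-layer ablation strength. As a rough sketch, and assuming the parameters describe a linear falloff kernel (peak weight `max_weight` at layer position `max_weight_position`, dropping linearly to `min_weight` at a distance of `min_weight_distance` layers, with layers beyond that left untouched), the weights could be computed as below. The kernel shape, the `AblationParams` and `layer_weight` names, and the layer count are assumptions made for illustration, not details confirmed by this card.

```python
from dataclasses import dataclass


@dataclass
class AblationParams:
    """Hypothetical container for one component's four abliteration parameters."""
    max_weight: float
    max_weight_position: float
    min_weight: float
    min_weight_distance: float


def layer_weight(p: AblationParams, layer_index: int) -> float:
    """Ablation weight for one layer under the assumed linear-falloff kernel."""
    distance = abs(layer_index - p.max_weight_position)
    if distance > p.min_weight_distance:
        return 0.0  # assumption: layers outside the window are not modified
    # Interpolate linearly from max_weight (at the peak) to min_weight (at the edge).
    return p.max_weight + (distance / p.min_weight_distance) * (p.min_weight - p.max_weight)


# Values copied from the table above.
ATTN_O_PROJ = AblationParams(1.48, 36.55, 1.29, 21.18)
MLP_DOWN_PROJ = AblationParams(1.48, 31.62, 1.43, 12.60)

if __name__ == "__main__":
    num_layers = 48  # hypothetical layer count, for illustration only
    for i in range(num_layers):
        attn = layer_weight(ATTN_O_PROJ, i)
        mlp = layer_weight(MLP_DOWN_PROJ, i)
        print(f"layer {i:2d}  attn.o_proj={attn:.3f}  mlp.down_proj={mlp:.3f}")
```

Under this reading, the fractional positions (36.55, 31.62) simply place the peak between two layers, so no single layer receives exactly `max_weight`.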

Performance

| Metric | This model (v1.1) | Original |
| --- | --- | --- |
| KL divergence | 0.32 | 0 (by definition) |
| Refusals, thinking on (adversarial harmful set) | ~22% | ~99% |

The KL divergence is high relative to a typical abliteration; that is the cost of suppressing refusal through the reasoning trajectory of a thinking model. The model complies with the vast majority of requests in normal use. A small fraction of extreme prompts may still be refused, and a stronger v2 is in progress.
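For readers unfamiliar with the metric, the following is a minimal sketch of a KL divergence between two next-token distributions, one from the original model and one from the modified model. The card does not state the evaluation protocol (which prompts, which token positions, or which direction of the divergence), so the direction used here, KL(original || modified), and the helper names are assumptions.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(P || Q) in nats for two discrete distributions over the same vocabulary.

    Terms with p_i == 0 contribute nothing; q_i must be positive wherever p_i is.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    # Toy logits over a 4-token vocabulary, purely illustrative.
    original = softmax([2.0, 1.0, 0.5, -1.0])
    modified = softmax([1.0, 2.0, 0.5, -1.0])
    print(f"KL(original || original) = {kl_divergence(original, original):.4f}")
    print(f"KL(original || modified) = {kl_divergence(original, modified):.4f}")
```

Comparing a model with itself always gives 0, which is why the "Original" column reads 0 by definition. Larger values mean the modified model's output distribution has drifted further from the original's.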


This is the QAT-Q4_0 "unquantized" checkpoint of Gemma 4 12B (plain bf16 weights with QAT calibration baked in), decensored with Heretic. GGUF quantizations are available at igorls/gemma-4-12B-it-qat-q4_0-unquantized-heretic-GGUF.

Disclaimer

This is an abliterated model: its built-in safety guardrails and refusal behavior have been deliberately removed. As a result it will attempt to answer essentially any prompt and can produce content that is offensive, inaccurate, explicit, or otherwise harmful — content the original Gemma 4 would have refused.

  • No safety alignment. Do not rely on it to refuse unsafe requests or to self-moderate. Apply your own filtering, guardrails, and human review before any production or user-facing use.
  • You are solely responsible for how you use this model and for complying with all applicable laws and with the license of the base Gemma 4 model (Apache 2.0).
  • Intended for adults (18+), for research, evaluation, and lawful creative use where permitted.
  • Provided as-is, without warranty of any kind. The author accepts no liability for any output produced or any use of this model.
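As one way to act on the "apply your own filtering" point above, the sketch below wraps an arbitrary generation function with a check on both the prompt and the output. All names here (`guarded_generate`, `make_blocklist_check`, the fallback message) are hypothetical, and a substring blocklist is only a placeholder: a real deployment would likely swap in a dedicated moderation classifier and human review.

```python
from typing import Callable, Iterable


def make_blocklist_check(terms: Iterable[str]) -> Callable[[str], bool]:
    """Return a check that passes only if none of the terms appear (case-insensitive)."""
    lowered = [t.lower() for t in terms]

    def is_allowed(text: str) -> bool:
        haystack = text.lower()
        return not any(term in haystack for term in lowered)

    return is_allowed


def guarded_generate(
    generate: Callable[[str], str],
    is_allowed: Callable[[str], bool],
    prompt: str,
    fallback: str = "[blocked by application policy]",
) -> str:
    """Run `generate` only for allowed prompts, and suppress disallowed outputs."""
    if not is_allowed(prompt):
        return fallback  # reject before spending any compute on generation
    output = generate(prompt)
    return output if is_allowed(output) else fallback


if __name__ == "__main__":
    # Stand-in for a real model call; replace with your own inference function.
    def fake_model(prompt: str) -> str:
        return f"echo: {prompt}"

    check = make_blocklist_check(["forbidden-topic"])
    print(guarded_generate(fake_model, check, "tell me a story"))
    print(guarded_generate(fake_model, check, "explain forbidden-topic"))
```

Keeping the policy check separate from the model call means the filter can be replaced or tightened without touching inference code.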

Model provider

liskasYR

Model tree

Base

google/gemma-4-12B-it-qat-q4_0-unquantized

Fine-tuned

this model

Modalities

Input

Video, Audio, Text, Image

Output

Text
