prithivMLmods/gemma-4-12B-it-heretic_decensored API & Inference Endpoint

Key Highlights

Heretic-Based Abliteration: Modified using the Heretic toolkit to identify and alter refusal-related representations within the model.
Reduced Refusal Behavior: Refusals drop from 99/100 for the original model to 34/100, while instruction-following is intended to remain intact.
Gemma 4 Backbone: Built directly on top of google/gemma-4-12B-it.
Reasoning-Oriented Performance: Aims to preserve multi-step reasoning and analytical capabilities after abliteration; the reported KL divergence from the original model is 0.0366.
Research-Focused Release: Designed for alignment research, model behavior analysis, and evaluation of refusal-direction modifications.
12B Scale Deployment: Suitable for local inference, research environments, and optimized deployment setups.

Abliteration Parameters

Parameter	Value
direction_index	29.56
attn.o_proj.max_weight	1.18
attn.o_proj.max_weight_position	39.94
attn.o_proj.min_weight	0.81
attn.o_proj.min_weight_distance	25.73
mlp.down_proj.max_weight	1.37
mlp.down_proj.max_weight_position	46.27
mlp.down_proj.min_weight	0.97
mlp.down_proj.min_weight_distance	21.63
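
The parameter names suggest a per-layer weight profile for each ablated component: a peak weight at `max_weight_position` that tapers toward `min_weight` over `min_weight_distance` layers. The sketch below assumes a linear taper and no ablation outside that window. The interpolation rule, the function name `ablation_weight`, and the 48-layer count are illustrative assumptions and are not stated in this model card.

```python
def ablation_weight(layer_index, max_weight, max_weight_position,
                    min_weight, min_weight_distance):
    """Weight applied at one layer under an assumed linear taper."""
    distance = abs(layer_index - max_weight_position)
    if distance > min_weight_distance:
        return 0.0  # assumption: layers outside the window are left untouched
    # Linear interpolation from max_weight (distance 0) to min_weight (window edge).
    return max_weight + (distance / min_weight_distance) * (min_weight - max_weight)


# Values from the table above for attn.o_proj.
ATTN_O_PROJ = dict(max_weight=1.18, max_weight_position=39.94,
                   min_weight=0.81, min_weight_distance=25.73)

NUM_LAYERS = 48  # hypothetical layer count, for illustration only

profile = [ablation_weight(i, **ATTN_O_PROJ) for i in range(NUM_LAYERS)]
for i in (10, 14, 15, 40, 47):
    print(f"layer {i:2d}: weight {profile[i]:.3f}")
```

Under a similar reading, a fractional `direction_index` such as 29.56 could denote a blend of the refusal directions measured at layers 29 and 30, though that interpretation is also an assumption.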

Performance

Metric	This model	Original model (google/gemma-4-12B-it)
KL divergence	0.0366	0 (by definition)
Refusals	34/100	99/100
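
To make the two metrics concrete, the sketch below computes a KL divergence between two toy next-token distributions and converts the refusal counts from the table into rates. The toy probabilities, the function names, and the direction of the divergence (original relative to modified) are assumptions for illustration; the model card does not describe how the evaluation was run.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same tokens."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def refusal_rate(refusals, total):
    """Fraction of evaluated prompts that were refused."""
    return refusals / total


# Toy next-token distributions (made up, not measured from either model).
original = [0.70, 0.20, 0.10]
modified = [0.65, 0.23, 0.12]

print(f"KL(original || modified) = {kl_divergence(original, modified):.4f}")
print(f"KL(original || original) = {kl_divergence(original, original):.4f}")

# Refusal counts reported in the table above.
this_model = refusal_rate(34, 100)
base_model = refusal_rate(99, 100)
print(f"refusal rate: {base_model:.0%} -> {this_model:.0%}")
```

The second print shows why the original model's entry is 0 by definition: a distribution has zero divergence from itself.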

Quick Start with Transformers

bash
pip install transformers
pip install accelerate

python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "prithivMLmods/gemma-4-12B-it-heretic_decensored"

# Load the weights in the checkpoint's dtype and spread them across available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [
    {
        "role": "user",
        "content": "Explain how a transformer model processes text."
    }
]

# Build the chat prompt; return_dict=True yields both input_ids and attention_mask.
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512
)

# Decode only the newly generated tokens, skipping the prompt.
prompt_length = inputs["input_ids"].shape[-1]
print(
    tokenizer.decode(
        outputs[0][prompt_length:],
        skip_special_tokens=True
    )
)

GGUF Model Files

Resource	Link
`prithivMLmods/gemma-4-12B-it-heretic_decensored-GGUF`	https://huggingface.co/prithivMLmods/gemma-4-12B-it-heretic_decensored-GGUF
Quick Start with llama.cpp (Docker)	https://huggingface.co/prithivMLmods/gemma-4-12B-it-heretic_decensored-GGUF#quick-start-with-llamacpp-docker

Intended Use

Alignment Research: Studying refusal-direction analysis and behavior modification techniques.
Model Evaluation: Benchmarking reasoning, instruction-following, and safety-related behaviors.
Red Teaming: Analyzing model responses under reduced-refusal conditions.
Local Deployment: Running high-capacity Gemma 4 models in research and experimentation environments.
Abliteration Studies: Exploring the effects of targeted weight-space modifications on model behavior.
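
For the evaluation and red-teaming uses listed above, a common starting point is a substring check that flags responses containing refusal phrases. The sketch below is one such heuristic. The marker list, the function names, and the sample responses are hypothetical, and a real study would likely need a more careful classifier.

```python
# Hypothetical refusal phrases; extend or replace for a real evaluation.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm sorry", "i am unable")


def is_refusal(response, markers=REFUSAL_MARKERS):
    """Return True if the response contains any refusal marker."""
    # Lowercase and normalize curly apostrophes so "I’m" matches "i'm".
    text = response.lower().replace("’", "'")
    return any(marker in text for marker in markers)


def count_refusals(responses):
    """Count how many responses are flagged as refusals."""
    return sum(is_refusal(r) for r in responses)


# Made-up sample responses for illustration.
samples = [
    "I'm sorry, but I can't help with that.",
    "A transformer splits text into tokens and applies attention layers.",
    "I cannot assist with this request.",
]
n = count_refusals(samples)
print(f"{n}/{len(samples)} responses flagged as refusals")
```

Substring matching is cheap but can miss partial refusals and can flag harmless text that happens to contain a marker, so flagged counts are best spot-checked by hand.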

Limitations & Risks

Important Note: This model intentionally reduces built-in refusal mechanisms.

Sensitive Content Risk: May generate unrestricted, controversial, or unsafe outputs.
User Responsibility: Users are responsible for reviewing outputs and for using the model carefully and ethically.
Experimental Modifications: Behavior may differ significantly from the original model.
Alignment Trade-offs: Reduced refusal behavior may impact safety filtering and response constraints.
Potential Artifacts: Certain prompts may expose unexpected outputs resulting from the abliteration process.

Acknowledgements

Heretic: A fully automatic censorship removal framework for language models, used here to perform the refusal-direction analysis and ablation procedures that form the foundation of this model.
Model Trials & Evaluation: Experimental evaluations, refusal measurements, and optimization trials were conducted and documented at: https://huggingface.co/strangeropshf/demo-TERM-hf-job-01
