prithivMLmods
ultraqwen3.5-4b-heretic-uncensored
License: apache-2.0

Key Highlights
- Heretic-Based Abliteration: Modified using the Heretic toolkit to identify and alter refusal-related representations within the model.
- Reduced Refusal Behavior: Optimized to minimize internal refusal tendencies while maintaining instruction-following capabilities.
- Qwen 3.5 Backbone: Built directly on top of Qwen/Qwen3.5-4B.
- Reasoning-Oriented Performance: Preserves multi-step reasoning and analytical capabilities after abliteration.
- Research-Focused Release: Designed for alignment research, model behavior analysis, and evaluation of refusal-direction modifications.
- Efficient 4B Deployment: Suitable for local inference, research environments, and optimized deployment setups.
Abliteration Parameters
| Parameter | Value |
|---|---|
| direction_index | 18.36 |
| attn.o_proj.max_weight | 1.27 |
| attn.o_proj.max_weight_position | 19.73 |
| attn.o_proj.min_weight | 0.73 |
| attn.o_proj.min_weight_distance | 11.05 |
| mlp.down_proj.max_weight | 1.33 |
| mlp.down_proj.max_weight_position | 25.44 |
| mlp.down_proj.min_weight | 0.74 |
| mlp.down_proj.min_weight_distance | 14.88 |
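The card does not say how these values are applied. One plausible reading, based on the per-layer weight kernel described in the Heretic project's documentation, is that each component is ablated most strongly at `max_weight_position`, with the weight falling linearly to `min_weight` at a distance of `min_weight_distance` layers and dropping to zero beyond that. The sketch below only evaluates that kernel for the table's values. The helper names and `NUM_LAYERS = 32` are hypothetical and not taken from this card, and the fractional `direction_index` (which suggests interpolating between per-layer refusal directions) is not modeled.

```python
def ablation_weight(layer, max_weight, max_weight_position,
                    min_weight, min_weight_distance):
    """Assumed linear kernel: max_weight at the peak position, min_weight at
    the edge of the window, and 0.0 for layers outside the window."""
    distance = abs(layer - max_weight_position)
    if distance > min_weight_distance:
        return 0.0
    return max_weight + (distance / min_weight_distance) * (min_weight - max_weight)


# Values copied from the Abliteration Parameters table.
O_PROJ = dict(max_weight=1.27, max_weight_position=19.73,
              min_weight=0.73, min_weight_distance=11.05)
DOWN_PROJ = dict(max_weight=1.33, max_weight_position=25.44,
                 min_weight=0.74, min_weight_distance=14.88)

NUM_LAYERS = 32  # hypothetical layer count, used only for illustration


def layer_weights(params, num_layers=NUM_LAYERS):
    """Return the assumed ablation weight for every layer index."""
    return [round(ablation_weight(i, **params), 3) for i in range(num_layers)]


if __name__ == "__main__":
    for name, params in (("attn.o_proj", O_PROJ), ("mlp.down_proj", DOWN_PROJ)):
        print(name, layer_weights(params))
```

Under this reading, layers far from the peak are left untouched, which would explain why each component carries its own position and distance parameters.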
Refusal Evaluation
| Metric | This model | Original model (Qwen/Qwen3.5-4B) |
|---|---|---|
| Refusals | 2/100 | 99/100 |
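The card reports refusal counts but not how a refusal was detected. A common approach, assumed here, is to flag a response as a refusal when it contains one of a small set of marker phrases. The marker list and function names below are hypothetical, so this sketch will not necessarily reproduce the numbers in the table.

```python
# Hypothetical marker phrases; the list used for the reported numbers is not given.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm sorry", "i am unable")


def is_refusal(response, markers=REFUSAL_MARKERS):
    """Return True if the response contains any refusal marker (case-insensitive)."""
    text = response.lower().replace("\u2019", "'")  # normalize curly apostrophes
    return any(marker in text for marker in markers)


def count_refusals(responses, markers=REFUSAL_MARKERS):
    """Count how many responses are flagged as refusals."""
    return sum(is_refusal(r, markers) for r in responses)


if __name__ == "__main__":
    sample = [
        "I'm sorry, but I can't help with that.",
        "Sure, here is an overview.",
        "I cannot assist with this request.",
    ]
    print(f"{count_refusals(sample)}/{len(sample)} refusals")
```

Substring matching is cheap but coarse: it can miss softly worded refusals and can flag compliant answers that happen to quote a marker, so spot-checking flagged and unflagged responses is advisable.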
Quick Start with Transformers
```bash
pip install transformers
pip install accelerate
```

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the model and tokenizer; dtype and device placement are chosen automatically.
model = AutoModelForCausalLM.from_pretrained(
    "prithivMLmods/ultraqwen3.5-4b-heretic-uncensored",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(
    "prithivMLmods/ultraqwen3.5-4b-heretic-uncensored"
)

messages = [
    {"role": "user", "content": "Explain how a transformer model processes text."}
]

# return_dict=True so generate() receives both input_ids and attention_mask.
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the prompt.
prompt_length = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True))
```
GGUF Model Files
| Resource | Link |
|---|---|
| prithivMLmods/ultraqwen3.5-4b-heretic-uncensored-gguf | https://huggingface.co/prithivMLmods/ultraqwen3.5-4b-heretic-uncensored-gguf |
Intended Use
- Alignment Research: Studying refusal-direction analysis and behavior modification techniques.
- Model Evaluation: Benchmarking reasoning, instruction-following, and safety-related behaviors.
- Red Teaming: Analyzing model responses under reduced-refusal conditions.
- Local Deployment: Running compact Qwen models in research and experimentation environments.
- Abliteration Studies: Exploring the effects of targeted weight-space modifications on model behavior.
Limitations & Risks
Important Note: This model intentionally reduces built-in refusal mechanisms.
- Sensitive Content Risk: May generate unrestricted, controversial, or unsafe outputs.
- User Responsibility: Requires careful and ethical use.
- Experimental Modifications: Behavior may differ significantly from the original model.
- Alignment Trade-offs: Reduced refusal behavior may impact safety filtering and response constraints.
- Potential Artifacts: Certain prompts may expose unexpected outputs resulting from the abliteration process.
Acknowledgements
- Qwen/Qwen3.5-4B: A compact 4B-parameter model from the Qwen family, designed for strong reasoning and efficient deployment with long-context support.
- Heretic: A fully automatic censorship removal framework for language models. This project was used to perform the refusal-direction analysis and ablation procedures that form the foundation of this model.
- Model Trials & Evaluation: Experimental evaluations, refusal measurements, and optimization trials were conducted and documented at https://huggingface.co/strangeropshf/demo-job-op-ft-19
Model Details
- Model provider: prithivMLmods
- Base model: Qwen/Qwen3.5-4B (this model is fine-tuned from it)
- Input modalities: Video, Text, Image
- Output modalities: Text