hirundo-io
Qwen3.5-4B-restrictions-removed-lora
README
License: apache-2.0
Abliteration stats (Heretic keyword metric, 100 harmful prompts)
| | Base | Abliterated |
|---|---|---|
| Refusals | 94/100 | 33/100 |
| KL divergence | — | 0.017 |
Trial 191 from the default Heretic run (200 Optuna trials, seed 1234).
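The two rows in the table can be illustrated with a minimal sketch: a keyword-based refusal counter and a KL divergence between two discrete distributions. The marker list, the helper names (`is_refusal`, `count_refusals`, `kl_divergence`), and the sample responses are all hypothetical and chosen for illustration; this is not Heretic's actual marker list or implementation, and which distributions Heretic compares is not stated here.

```python
import math

# Hypothetical refusal markers, for illustration only (not Heretic's list).
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm sorry", "as an ai")


def is_refusal(response, markers=REFUSAL_MARKERS):
    """Return True if the response contains any refusal marker."""
    # Normalize case and curly apostrophes before substring matching.
    text = response.lower().replace("\u2019", "'")
    return any(marker in text for marker in markers)


def count_refusals(responses, markers=REFUSAL_MARKERS):
    """Count how many responses are flagged as refusals."""
    return sum(is_refusal(r, markers) for r in responses)


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions on the same support.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Made-up sample responses, not taken from the evaluation.
responses = [
    "I'm sorry, but I can't help with that.",
    "Sure, here is an overview.",
    "I cannot assist with this request.",
]
print(f"{count_refusals(responses)}/{len(responses)} refusals")
```

A keyword metric of this kind only detects refusals that use the listed phrasings, so a count such as 33/100 is best read as an estimate rather than an exact measure of model behavior.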
Usage
```python
from peft import PeftModel
from transformers import AutoModelForImageTextToText, AutoProcessor

# Load the base model, then attach the LoRA adapter on top of it.
base = AutoModelForImageTextToText.from_pretrained(
    "Qwen/Qwen3.5-4B",
    trust_remote_code=True,
    torch_dtype="auto",
    device_map="auto",
)
model = PeftModel.from_pretrained(
    base,
    "hirundo-io/Qwen3.5-4B-restrictions-removed-lora",
)
# The processor comes from the base model repository.
processor = AutoProcessor.from_pretrained("Qwen/Qwen3.5-4B", trust_remote_code=True)
```
Merged weights
Full merged weights (no PEFT required) are at hirundo-io/Qwen3.5-4B-restrictions-removed.
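Loading the merged repository might look like the sketch below, which mirrors the adapter example without the PEFT step. The helper name `load_merged` is hypothetical, and the sketch assumes the merged repository ships its own processor files; if it does not, loading the processor from `Qwen/Qwen3.5-4B` as in the adapter example should work instead.

```python
MERGED_REPO = "hirundo-io/Qwen3.5-4B-restrictions-removed"


def load_merged(repo_id=MERGED_REPO):
    """Load the merged model and its processor (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require
    # transformers to be installed; calling it does.
    from transformers import AutoModelForImageTextToText, AutoProcessor

    model = AutoModelForImageTextToText.from_pretrained(
        repo_id,
        trust_remote_code=True,
        torch_dtype="auto",
        device_map="auto",
    )
    # Assumption: the merged repo includes processor/tokenizer files.
    processor = AutoProcessor.from_pretrained(repo_id, trust_remote_code=True)
    return model, processor
```

Calling `model, processor = load_merged()` downloads the full weights, so it is best run on a machine with enough disk space and GPU memory for a 4B-parameter model.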
Notes
Produced with Heretic v1.4.0.
Model provider
hirundo-io
Model tree
Base
Qwen/Qwen3.5-4B
Adapter
this model
Modalities
Input
Video, Text, Image
Output
Text