README

License: apache-2.0

Evaluation [Self Reported]

Metric                       Result
Refusal Rate (harm_bench)    0 / 250
Test Setup                   250 random harmful prompts
Inference Pipeline           Transformers
Inference Type               text-generation
Dataset                      harm_bench

Note: This model was tested on 250 harmful prompts sampled at random from the harm_bench dataset and refused none of them (0 refusals out of 250). For more details, refer to the harm_bench dataset page.
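
The card does not say how a response was classified as a refusal. As a rough sketch only, a tally like the one in the table could be produced with a phrase-matching heuristic such as the one below. The `REFUSAL_MARKERS` list and the `is_refusal` and `refusal_rate` helpers are hypothetical and are not the author's evaluation code.

```python
# Hypothetical refusal-rate tally. The marker list is an assumption,
# not the heuristic behind the reported 0 / 250 result.
REFUSAL_MARKERS = (
    "i can't",
    "i cannot",
    "i won't",
    "i'm sorry",
    "i am unable",
)


def is_refusal(response: str) -> bool:
    """Return True if the opening of the response contains a refusal marker."""
    opening = response.strip().lower()[:200]
    return any(marker in opening for marker in REFUSAL_MARKERS)


def refusal_rate(responses: list[str]) -> tuple[int, int]:
    """Return (number of refusals, number of responses)."""
    refused = sum(is_refusal(r) for r in responses)
    return refused, len(responses)


sample = [
    "I'm sorry, but I can't help with that.",
    "Here is a detailed description of the image.",
]
refused, total = refusal_rate(sample)
print(f"{refused} / {total}")  # prints "1 / 2"
```

Phrase matching is cheap but brittle: it misses refusals worded differently and can flag harmless text, so a reported rate is only as reliable as the classifier behind it.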

Key Highlights

  • Heretic Stable Training: Refined to reduce internal refusal behaviors while improving response stability and coherent long-form multimodal generation.
  • 8B Multimodal Architecture: Based on Qwen3-VL-8B-Instruct, delivering strong vision-language understanding and detailed reasoning capabilities.
  • Enhanced Visual Reasoning: Optimized for deep analysis of artistic, technical, forensic, abstract, and research-oriented visual content.
  • High-Fidelity Captioning: Generates rich and descriptive captions suitable for metadata generation, accessibility pipelines, and dataset enrichment.
  • Dynamic Resolution Handling: Maintains native Qwen3-VL support for multiple aspect ratios and high-resolution image processing.
  • Stable Instruction Following: Tuned to preserve conversational coherence and reduce generation instability during extended reasoning tasks.

Quick Start with Transformers

bash

pip install transformers==5.9.0
# or
pip install git+https://github.com/huggingface/transformers.git
# the Python example below also imports qwen_vl_utils
pip install qwen-vl-utils

python

from transformers import Qwen3VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info

# Load the Heretic Stable model
model = Qwen3VLForConditionalGeneration.from_pretrained(
    "prithivMLmods/Qwen3-VL-8B-Heretic-Stable",
    torch_dtype="auto",
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(
    "prithivMLmods/Qwen3-VL-8B-Heretic-Stable"
)

# One user turn containing an image and a text instruction
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg",
            },
            {
                "type": "text",
                "text": "Provide a detailed caption and reasoning for this image.",
            },
        ],
    }
]

# Render the chat template to a prompt string and collect the vision inputs
text = processor.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
).to(model.device)  # follow the device chosen by device_map="auto"

generated_ids = model.generate(
    **inputs,
    max_new_tokens=256,
)

# Drop the prompt tokens so only the newly generated tokens are decoded
generated_ids_trimmed = [
    out_ids[len(in_ids):]
    for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed,
    skip_special_tokens=True,
    clean_up_tokenization_spaces=False,
)
print(output_text)

Intended Use

  • Advanced Multimodal Research: Exploring reasoning behavior and multimodal robustness across diverse prompts.
  • Visual Dataset Enrichment: Producing detailed captions for historical, artistic, scientific, or technical datasets.
  • Behavioral Alignment Research: Studying the effects of refusal-reduction and abliteration-based fine-tuning strategies.
  • Creative Vision-Language Applications: Supporting storytelling, world-building, visual narration, and scene interpretation workflows.
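
For the dataset-enrichment use case, the message layout from Quick Start can be wrapped in small helpers so the same prompt is applied to many images. The sketch below is illustrative only: `build_caption_messages`, `to_jsonl_record`, the JSONL record layout, and the example.com URL are assumptions, and the model call itself is left to the Quick Start code.

```python
import json

# Hypothetical helpers for a captioning pipeline; the names and the JSONL
# record layout are illustrative assumptions, not part of the model's API.
DEFAULT_PROMPT = "Provide a detailed caption and reasoning for this image."


def build_caption_messages(image_ref: str, prompt: str = DEFAULT_PROMPT) -> list[dict]:
    """Build a single-turn message list in the layout used in Quick Start."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image_ref},
                {"type": "text", "text": prompt},
            ],
        }
    ]


def to_jsonl_record(image_ref: str, caption: str) -> str:
    """Serialize one image/caption pair as a JSON line for dataset metadata."""
    return json.dumps({"image": image_ref, "caption": caption}, ensure_ascii=False)


# Placeholder URL and caption; in practice the caption comes from model.generate.
messages = build_caption_messages("https://example.com/photo.jpg")
record = to_jsonl_record("https://example.com/photo.jpg", "A placeholder caption.")
print(record)
```

Keeping message construction separate from inference makes it easy to change the prompt per dataset without touching the generation code.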

Limitations & Risks

Important Notice: This model intentionally minimizes conventional refusal mechanisms.

  • Sensitive Output Generation: The model may produce explicit, controversial, or unrestricted outputs depending on prompts.
  • User Responsibility: Outputs should be used responsibly and in accordance with applicable legal and ethical standards.
  • Large Hardware Requirements: High-resolution multimodal inference may require substantial GPU memory and compute resources.
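
To put the hardware point in rough numbers, the memory needed just to hold the weights can be estimated from the parameter count and the bytes per parameter. This is a back-of-the-envelope sketch: it assumes the nominal 8e9 parameters implied by the "8B" name (the exact count may differ) and ignores activations, the KV cache, and image tokens, which add more on top. The `weight_memory_gib` helper is hypothetical.

```python
# Weights-only estimate. The 8e9 parameter count is the nominal "8B"
# figure, assumed here rather than taken from the model card.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Approximate memory needed to hold the weights, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


for dtype in ("float32", "bfloat16", "int8"):
    print(f"{dtype}: {weight_memory_gib(8e9, dtype):.1f} GiB")
```

High-resolution images increase the number of visual tokens, so actual peak usage during inference will be higher than this weights-only figure.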

Model Lineage

  • Base Model: Qwen/Qwen3-VL-8B-Instruct
  • Intermediate Variant: prithivMLmods/Qwen3-VL-8B-Instruct-Unredacted-MAX
  • Current Release: prithivMLmods/Qwen3-VL-8B-Heretic-Stable

Acknowledgements

I would like to thank the works of the following:

Model provider: prithivMLmods

Model tree

  • Base: prithivMLmods/Qwen3-VL-8B-Instruct-Unredacted-MAX
  • Fine-tuned: this model

Modalities

  • Input: Text, Image
  • Output: Text
