prithivMLmods
Gliese-Qwen3.5-9B-Abliterated-Caption
License: apache-2.0
Base Model Signatures:
This model has been re-sharded and optimized for the latest Transformers version from the base model: https://huggingface.co/huihui-ai/Huihui-Qwen3.5-9B-abliterated.
Download the model
bash
hf auth login --token <YOUR_HF_TOKEN>
hf download prithivMLmods/Gliese-Qwen3.5-9B-Abliterated-Caption
Key Highlights
- Advanced Refusal Direction Analysis – Uses targeted activation analysis to identify and mitigate refusal directions within the model's latent space.
- Abliterated Caption Training – Fine-tuned for unfiltered and detailed caption generation, enabling comprehensive visual descriptions without excessive refusal behaviors.
- Optimized Visual Understanding – Enhanced to provide rich, context-aware descriptions of scenes, objects, people, and environments.
- 9B Parameter Architecture – Built on Qwen3.5-9B, delivering strong multimodal reasoning and improved caption quality while remaining deployable on modern GPUs.
- High-Fidelity Caption Generation – Designed to produce long-form, structured, and semantically detailed captions suitable for dataset generation, annotation, and research.
- Efficient Deployment – Suitable for caption dataset creation, multimodal research, local inference pipelines, and AI development workflows.
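The first highlight refers to refusal directions in the latent space. As a rough illustration only, the sketch below shows one common form of directional ablation: estimate a direction as the normalized difference between the mean activations of two prompt sets, then subtract each hidden vector's projection onto it. Whether this model was produced exactly this way is an assumption; the function names and the toy 3-dimensional activations are hypothetical and not taken from the model.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def unit_direction(acts_a, acts_b):
    """Normalized difference between the mean activations of two prompt sets."""
    diff = [a - b for a, b in zip(mean_vector(acts_a), mean_vector(acts_b))]
    norm = math.sqrt(sum(x * x for x in diff))
    if norm == 0.0:
        raise ValueError("the two activation sets have identical means")
    return [x / norm for x in diff]


def ablate(hidden, direction):
    """Remove the component of `hidden` that lies along the unit vector `direction`."""
    coeff = sum(h * d for h, d in zip(hidden, direction))
    return [h - coeff * d for h, d in zip(hidden, direction)]


# Toy activations (hypothetical values chosen for easy arithmetic).
refused = [[2.0, 1.0, 0.0], [4.0, 1.0, 0.0]]
answered = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]

direction = unit_direction(refused, answered)
cleaned = ablate([3.0, 5.0, -1.0], direction)
```

After ablation the vector has no component along `direction`, while the components orthogonal to it are unchanged.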
Quick Start with Transformers
bash
pip install transformers==5.3.0
# or
pip install git+https://github.com/huggingface/transformers.git
python
from transformers import Qwen3_5ForConditionalGeneration, AutoProcessor
import torch

# Load the model; device_map="auto" places the weights on the available device(s).
model = Qwen3_5ForConditionalGeneration.from_pretrained(
    "prithivMLmods/Gliese-Qwen3.5-9B-Abliterated-Caption",
    torch_dtype="auto",
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(
    "prithivMLmods/Gliese-Qwen3.5-9B-Abliterated-Caption"
)

# Note: this message carries only text; add an image entry to caption an actual image.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image in extreme detail."}
        ],
    }
]

# Render the chat template to a prompt string, then tokenize it.
text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
# Move the inputs to the model's device instead of hard-coding "cuda".
inputs = processor(text=[text], padding=True, return_tensors="pt").to(model.device)

generated_ids = model.generate(**inputs, max_new_tokens=512)

# Drop the prompt tokens so only the newly generated text is decoded.
generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed,
    skip_special_tokens=True,
    clean_up_tokenization_spaces=False,
)
print(output_text)
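The Quick Start message contains only a text entry, so no image reaches the model. A minimal sketch of a message that pairs an image with the prompt is shown below. The helper name `build_caption_messages` and the file name `example.jpg` are hypothetical; the `{"type": "image", "image": ...}` entry follows the convention used by other Transformers vision-language chat templates, and it is an assumption that this model's processor accepts it unchanged.

```python
def build_caption_messages(image, prompt="Describe this image in extreme detail."):
    """Return a single-turn chat message list pairing one image with a text prompt.

    `image` is a local path or URL string.
    """
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image},
                {"type": "text", "text": prompt},
            ],
        }
    ]


# "example.jpg" is a placeholder path, not a file shipped with the model.
messages = build_caption_messages("example.jpg")
```

With processors that follow this convention, calling `processor.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_dict=True, return_tensors="pt")` typically loads the image and returns ready-to-use model inputs; treat that as something to verify against the installed Transformers version.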
Intended Use
- High-Detail Image Captioning – Generating extremely descriptive captions for images.
- Dataset Generation – Creating large-scale caption datasets for multimodal training.
- Vision-Language Research – Studying multimodal reasoning and captioning behavior.
- Annotation Automation – Assisting in automatic labeling and visual description tasks.
- Local Multimodal AI Deployment – Running powerful captioning models on local GPUs.
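For the dataset-generation use case, one plausible layout is a JSON Lines file with one image/caption record per line. The sketch below assumes a hypothetical `caption_fn` callable that wraps the model call; `placeholder_caption` merely stands in for it so the example runs without a GPU, and the output file name is likewise an assumption.

```python
import json
import os
import tempfile


def write_caption_dataset(image_paths, caption_fn, out_path):
    """Write one JSON object per line ({"image": ..., "caption": ...}); return the record count."""
    count = 0
    with open(out_path, "w", encoding="utf-8") as f:
        for path in image_paths:
            record = {"image": path, "caption": caption_fn(path)}
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count


def placeholder_caption(path):
    """Stand-in for a real model call; returns a dummy caption."""
    return f"caption for {os.path.basename(path)}"


out_file = os.path.join(tempfile.gettempdir(), "captions_demo.jsonl")
written = write_caption_dataset(["a.jpg", "b.jpg"], placeholder_caption, out_file)
```

Writing records as they are produced keeps memory use flat for large image sets and leaves a usable partial file if a run is interrupted.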
Limitations & Risks
Important Note: This model intentionally reduces built-in refusal mechanisms.
- Unfiltered Outputs – The model may generate explicit or controversial captions depending on the input images.
- User Responsibility – Generated outputs should be handled responsibly and within legal and ethical boundaries.
- Model Size Constraints – While strong, a 9B model still has limitations compared to frontier-scale multimodal architectures.
Model provider
prithivMLmods
Model tree
- Base: Qwen/Qwen3.5-9B
- Quantized: this model
Modalities
- Input: Video, Text, Image
- Output: Text