prithivMLmods
Q3.5-9B-GLM-5.1-DA
License: apache-2.0
Key Highlights
- GLM-5.1 Reasoning Distillation: Fine-tuned using high-quality reasoning traces derived from GLM-5.1 datasets with a strong focus on mathematical and long-context reasoning.
- Distilled-Abliterated (DA): Applies refusal direction analysis and ablation-based strategies to reduce internal refusal behaviors while maintaining reasoning quality.
- Qwen3.5 Backbone: Built on top of Qwen/Qwen3.5-9B via prithivMLmods/Qwen3.5-9B-Unredacted-MAX for strong instruction-following and reasoning performance.
- Long-Context Mathematical Reasoning: Optimized for multi-step mathematical problem solving, logical decomposition, and extended reasoning chains.
- Instruction + Reasoning Fusion: Handles prompts that mix instruction following with complex reasoning in a single response.
- Efficient 9B Deployment: Suitable for local inference and quantized deployment setups with lower hardware requirements compared to larger-scale models.
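The card does not spell out the exact abliteration recipe. As a rough illustration only, one common formulation of refusal direction ablation estimates a direction as the difference between mean activations on two prompt sets, then projects that direction out of a hidden state. The function names and the toy three-dimensional "activations" below are hypothetical and are not taken from this model.

```python
import math

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def refusal_direction(refused_acts, complied_acts):
    """Unit vector along the difference of the two mean activations."""
    diff = [r - c for r, c in zip(mean_vector(refused_acts),
                                  mean_vector(complied_acts))]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]

def ablate_direction(hidden, direction):
    """Remove the component of `hidden` that lies along `direction`."""
    coeff = sum(h * d for h, d in zip(hidden, direction))
    return [h - coeff * d for h, d in zip(hidden, direction)]

# Toy activations (hypothetical): the two sets differ only along axis 0
complied = [[0.0, 1.0, 0.0], [0.0, 3.0, 0.0]]
refused = [[2.0, 1.0, 0.0], [2.0, 3.0, 0.0]]

direction = refusal_direction(refused, complied)
cleaned = ablate_direction([2.0, 1.0, 0.0], direction)
```

In this toy case the estimated direction is the first axis, so ablation zeroes that coordinate and leaves the others untouched. Real pipelines operate on per-layer model activations or weight matrices, which this sketch does not attempt.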
Datasets Used and Training Details
| Category | Details |
|---|---|
| Base Model | Qwen/Qwen3.5-9B |
| Intermediate Base | prithivMLmods/Qwen3.5-9B-Unredacted-MAX |
| Final Model Size | 9B Parameters |
| Training Type | Distillation + abliteration |
| Objective | Preserve long-context reasoning quality while reducing refusal behaviors and improving mathematical reasoning reliability |
| Reasoning Dataset | Jackrong/GLM-5.1-Reasoning-1M-Cleaned (Subset-Math, 5000 random samples used) |
| Alignment / Evaluation Dataset | prithivMLmods/harm_bench |
| Training Pipeline | TRL (Transformer Reinforcement Learning) |
| Training Focus | Long-context reasoning, mathematical problem solving, logical decomposition, structured chain-of-thought generation |
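The table notes that 5000 random samples were drawn from the math subset, but not how the draw was made. A minimal, reproducible way to take such a sample with Python's standard library might look like the following; the seed, the function name, and the stand-in record list are assumptions for illustration.

```python
import random

def sample_subset(records, k=5000, seed=42):
    """Return k records drawn without replacement using a fixed seed."""
    if k > len(records):
        raise ValueError("requested more samples than available records")
    rng = random.Random(seed)  # local RNG so global state is untouched
    return rng.sample(records, k)

# Stand-in for the real dataset rows (hypothetical integer IDs only)
records = list(range(100_000))
subset = sample_subset(records)
```

Fixing the seed makes the subset repeatable across runs, which helps when comparing training variants on the same data.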
Quick Start with Transformers
bash
pip install transformers==5.8.0
# or latest
pip install git+https://github.com/huggingface/transformers.git
python
from transformers import Qwen3_5ForConditionalGeneration, AutoProcessor
import torch

# Load the model and its processor from the Hub
model = Qwen3_5ForConditionalGeneration.from_pretrained(
    "prithivMLmods/Q3.5-9B-GLM-5.1-DA",
    torch_dtype="auto",
    device_map="auto"
)
processor = AutoProcessor.from_pretrained("prithivMLmods/Q3.5-9B-GLM-5.1-DA")

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Solve this step-by-step: If a train travels 240 km in 3 hours, what is its average speed?"
            }
        ],
    }
]

# Render the chat template to a prompt string, then tokenize it
text = processor.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
inputs = processor(
    text=[text],
    padding=True,
    return_tensors="pt"
).to(model.device)  # follow the device chosen by device_map="auto"

generated_ids = model.generate(**inputs, max_new_tokens=512)

# Drop the prompt tokens so only the newly generated tokens are decoded
generated_ids_trimmed = [
    out_ids[len(in_ids):]
    for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed,
    skip_special_tokens=True,
    clean_up_tokenization_spaces=False
)
print(output_text)
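The trimming step in the snippet above is needed because `generate` returns each prompt followed by its new tokens. The same slicing can be seen on plain Python lists; the token IDs here are made up for illustration and the helper name is hypothetical.

```python
def trim_prompt(input_ids, generated_ids):
    """Keep only the tokens produced after each prompt."""
    return [out[len(inp):] for inp, out in zip(input_ids, generated_ids)]

# Made-up token IDs: every generated sequence starts with its own prompt
input_ids = [[101, 7, 8], [101, 9, 10]]
generated_ids = [[101, 7, 8, 42, 43], [101, 9, 10, 55, 56]]

trimmed = trim_prompt(input_ids, generated_ids)
```

Because the processor pads the batch to a common length, slicing by prompt length removes exactly the prompt portion of every row.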
Intended Use
- Mathematical Reasoning: Multi-step arithmetic, algebraic reasoning, and logical problem solving
- Long-Context Tasks: Extended reasoning chains and context-heavy instruction following
- Instruction Following: Hybrid prompts combining reasoning and structured responses
- Research on Abliteration: Studying the impact of refusal reduction techniques on reasoning preservation
- Alignment & Red-Teaming Research: Evaluating reduced-refusal systems under complex reasoning scenarios
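For the abliteration and red-teaming research uses listed above, one simple and admittedly crude metric is the fraction of responses that open with a refusal phrase. The marker list, window size, and function names below are hypothetical; the card does not describe how refusals were measured, and keyword matching can both miss and over-count refusals.

```python
# Hypothetical marker phrases; a real evaluation would need a vetted list
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry", "i am unable")

def is_refusal(response, markers=REFUSAL_MARKERS, window=80):
    """True if a marker phrase appears near the start of the response."""
    head = response.strip().lower()[:window]
    return any(marker in head for marker in markers)

def refusal_rate(responses):
    """Fraction of responses flagged as refusals (0.0 for an empty list)."""
    if not responses:
        return 0.0
    return sum(is_refusal(r) for r in responses) / len(responses)

responses = [
    "I'm sorry, but I can't help with that.",
    "The average speed is 240 / 3 = 80 km/h.",
    "I cannot assist with this request.",
    "Step 1: factor the quadratic.",
]
rate = refusal_rate(responses)
```

Comparing this rate before and after an intervention, alongside a reasoning benchmark, gives a first look at whether refusal reduction came at the cost of answer quality.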
Limitations & Risks
Important Note: This model intentionally minimizes built-in safety refusals.
- Sensitive Content Risk: May produce unrestricted or controversial outputs
- User Responsibility: Requires careful and ethical usage
- Mathematical Hallucinations: Complex reasoning tasks may still contain logical or numerical inconsistencies
- Abliteration Trade-offs: Reduced refusal behaviors may impact safety alignment and output filtering
- High Compute Demand: Optimized inference or quantization may still be required for efficient deployment
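To put the compute note in rough numbers, weight memory scales with parameter count times bytes per parameter. The sketch below treats the 9B figure as exactly 9e9 parameters (an approximation) and ignores activations, the KV cache, and quantization overhead, so real usage will be higher.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

NUM_PARAMS = 9e9  # assumption: "9B" taken as exactly 9e9 parameters

estimates = {
    name: weight_memory_gb(NUM_PARAMS, bits)
    for name, bits in [("bf16", 16), ("int8", 8), ("4-bit", 4)]
}
```

Under these assumptions the weights alone come to about 18 GB in bf16, 9 GB in int8, and 4.5 GB at 4 bits per parameter.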
Model provider
prithivMLmods
Model tree
- Base: prithivMLmods/Qwen3.5-9B-Unredacted-MAX
- Fine-tuned: this model
Modalities
- Input: Video, Text, Image
- Output: Text