prithivMLmods

Q3.5-9B-GLM-5.1-DA

License: apache-2.0

Key Highlights

  • GLM-5.1 Reasoning Distillation: Fine-tuned using high-quality reasoning traces derived from GLM-5.1 datasets with a strong focus on mathematical and long-context reasoning.
  • Distilled-Abliterated (DA): Applies refusal direction analysis and ablation-based strategies to reduce internal refusal behaviors while maintaining reasoning quality.
  • Qwen3.5 Backbone: Built on top of Qwen/Qwen3.5-9B via prithivMLmods/Qwen3.5-9B-Unredacted-MAX for strong instruction-following and reasoning performance.
  • Long-Context Mathematical Reasoning: Optimized for multi-step mathematical problem solving, logical decomposition, and extended reasoning chains.
  • Instruction + Reasoning Fusion: Handles prompts that combine instruction following with complex, multi-step reasoning.
  • Efficient 9B Deployment: Suitable for local inference and quantized deployment, with lower hardware requirements than larger models.
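
The "refusal direction analysis and ablation" highlight above can be illustrated with a small sketch. The model card does not state the exact procedure used, so this shows one common formulation (difference-of-means direction, then projecting it out of a weight matrix) and is an assumption, not the author's pipeline. The activations and the weight matrix are random toy data, and the function names are hypothetical.

```python
import math
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def refusal_direction(harmful_acts, harmless_acts):
    """Unit vector along the difference of mean activations (assumed method)."""
    diff = [a - b for a, b in zip(mean_vector(harmful_acts), mean_vector(harmless_acts))]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]


def ablate_direction(weight, direction):
    """Return (I - r r^T) W for a weight of shape (d_model, d_in).

    After this, nothing the matrix writes has a component along `direction`.
    """
    d_in = len(weight[0])
    # proj[j] = sum_i r[i] * W[i][j], the part of column j that lies along r
    proj = [sum(r_i * row[j] for r_i, row in zip(direction, weight)) for j in range(d_in)]
    return [[w - r_i * p for w, p in zip(row, proj)] for r_i, row in zip(direction, weight)]


# Toy data: "harmful" activations are shifted along the first axis.
rng = random.Random(0)
d_model, d_in, n = 6, 4, 32
harmless = [[rng.gauss(0, 1) for _ in range(d_model)] for _ in range(n)]
harmful = [[rng.gauss(0, 1) + (2.0 if i == 0 else 0.0) for i in range(d_model)] for _ in range(n)]

r = refusal_direction(harmful, harmless)
W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_model)]
W_ablated = ablate_direction(W, r)

# Largest remaining component along r across all columns; should be ~0.
residual = max(
    abs(sum(r_i * row[j] for r_i, row in zip(r, W_ablated))) for j in range(d_in)
)
print(f"max residual along refusal direction: {residual:.2e}")
```

In a real model the same projection would be applied to the matrices that write into the residual stream, using activations collected from actual prompts rather than random vectors.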

Datasets Used and Training Details

| Category | Details |
|---|---|
| Base Model | Qwen/Qwen3.5-9B |
| Intermediate Base | prithivMLmods/Qwen3.5-9B-Unredacted-MAX |
| Final Model Size | 9B parameters |
| Training Type | Distillation + abliteration |
| Objective | Preserve long-context reasoning quality while reducing refusal behaviors and improving mathematical reasoning reliability |
| Reasoning Dataset | Jackrong/GLM-5.1-Reasoning-1M-Cleaned (math subset; 5,000 random samples used) |
| Alignment / Evaluation Dataset | prithivMLmods/harm_bench |
| Training Pipeline | TRL (Transformer Reinforcement Learning) |
| Training Focus | Long-context reasoning, mathematical problem solving, logical decomposition, structured chain-of-thought generation |
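
The training details mention 5,000 random samples drawn from the math subset. The card does not say how they were drawn, so the following is only a minimal sketch of a reproducible draw. `subset_size` and `seed` are placeholder values, not figures from the source, and `sample_indices` is a hypothetical helper.

```python
import random


def sample_indices(dataset_size, k=5000, seed=0):
    """Pick k distinct row indices reproducibly (hypothetical helper)."""
    if k > dataset_size:
        raise ValueError("cannot sample more rows than the dataset contains")
    return sorted(random.Random(seed).sample(range(dataset_size), k))


subset_size = 100_000  # placeholder: the real size of the math subset is not stated
indices = sample_indices(subset_size, k=5000, seed=0)
print(len(indices), indices[:3])
```

With the `datasets` library, the resulting indices could be passed to `Dataset.select` to materialize the subsample.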

Quick Start with Transformers

```bash
pip install transformers==5.8.0
# or install the latest development version from source
pip install git+https://github.com/huggingface/transformers.git
```

```python
import torch  # PyTorch backend used by Transformers
from transformers import AutoProcessor, Qwen3_5ForConditionalGeneration

MODEL_ID = "prithivMLmods/Q3.5-9B-GLM-5.1-DA"

# Load the weights in their stored dtype; device_map="auto" needs `accelerate`.
model = Qwen3_5ForConditionalGeneration.from_pretrained(
    MODEL_ID,
    torch_dtype="auto",
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(MODEL_ID)

# A single-turn, text-only conversation in the chat format the processor expects.
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Solve this step-by-step: If a train travels 240 km in 3 hours, what is its average speed?",
            }
        ],
    }
]

# Render the chat template to a prompt string, ending with the assistant turn marker.
text = processor.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Tokenize and move the inputs to the model's device instead of hard-coding "cuda".
inputs = processor(
    text=[text],
    padding=True,
    return_tensors="pt",
).to(model.device)

generated_ids = model.generate(**inputs, max_new_tokens=512)

# Drop the prompt tokens so that only the newly generated tokens are decoded.
generated_ids_trimmed = [
    out_ids[len(in_ids):]
    for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed,
    skip_special_tokens=True,
    clean_up_tokenization_spaces=False,
)
print(output_text[0])
```
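
Reasoning-tuned models often emit their chain of thought before the final answer. The sketch below separates the two in the decoded output from the snippet above. It assumes the reasoning is closed by a `</think>` tag, as in other Qwen3-family chat templates. The model card does not confirm this, so check the actual output first. `split_reasoning` is a hypothetical helper.

```python
def split_reasoning(text, close_tag="</think>", open_tag="<think>"):
    """Split decoded text into (reasoning, answer).

    Assumes the reasoning ends with `close_tag`; the tag names are an
    assumption, not something the model card specifies.
    """
    head, sep, tail = text.rpartition(close_tag)
    if not sep:
        # No closing tag found: treat the whole text as the answer.
        return "", text.strip()
    # The opening tag may be absent if the chat template already emitted it.
    reasoning = head.replace(open_tag, "", 1).strip()
    return reasoning, tail.strip()


sample = "<think>\n240 / 3 = 80\n</think>\n\nThe average speed is 80 km/h."
reasoning, answer = split_reasoning(sample)
print(answer)
```

Text without a closing tag is returned unchanged as the answer, so the helper is safe to apply even if the model skips the reasoning block.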

Intended Use

  • Mathematical Reasoning: Multi-step arithmetic, algebraic reasoning, and logical problem solving
  • Long-Context Tasks: Extended reasoning chains and context-heavy instruction following
  • Instruction Following: Hybrid prompts combining reasoning and structured responses
  • Research on Abliteration: Studying the impact of refusal reduction techniques on reasoning preservation
  • Alignment & Red-Teaming Research: Evaluating reduced-refusal systems under complex reasoning scenarios

Limitations & Risks

Important Note: This model intentionally minimizes built-in safety refusals.

  • Sensitive Content Risk: May produce unrestricted or controversial outputs
  • User Responsibility: Requires careful and ethical usage
  • Mathematical Hallucinations: Complex reasoning tasks may still contain logical or numerical inconsistencies
  • Abliteration Trade-offs: Reduced refusal behaviors may impact safety alignment and output filtering
  • Compute Demand: Even at 9B parameters, optimized inference or quantization may be required for efficient deployment
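
To make the compute point concrete, the memory needed just to hold the weights can be estimated from the parameter count and the bits per parameter. This is a rough sketch: it uses the nominal 9B figure and ignores activations, the KV cache, and quantization overhead, so real usage will be higher. `weight_memory_gib` is a hypothetical helper.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


NUM_PARAMS = 9e9  # nominal size from the model card; the exact count may differ

for label, bits in [("bf16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gib(NUM_PARAMS, bits):.1f} GiB")
# bf16: 16.8 GiB
# int8: 8.4 GiB
# 4-bit: 4.2 GiB
```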

Model provider

prithivMLmods

Model tree

  • Base: prithivMLmods/Qwen3.5-9B-Unredacted-MAX
  • Fine-tuned: this model

Modalities

  • Input: Video, Text, Image
  • Output: Text
