Run this model inference on single tenant GPU with unmatched speed and reliability at scale.
Get help setting up a custom Dedicated Endpoints.
Talk with our engineer to get a quote for reserved GPU instances with discounts.
README
License: apache-2.0Model Overview

This model has been specifically optimized to excel in several key areas:
- Structured Reasoning Quality: Enhanced ability to break down complex problems and think step-by-step.
- Instruction Adherence: Superior capability to follow strict guidelines and constraints provided in prompts.
- Safety-Aligned Behavior: Designed to operate safely in practical assistant and autonomous agent workflows.
- Robustness: Increased resistance against common misalignment patterns and adversarial inputs.
It leverages a rigorous post-training stack that combines supervised reasoning tuning with alignment-oriented optimization, focusing heavily on reliable behavior in real-world applications.
Training Approach
- Base Model:
Qwen/Qwen3.5-4B - Methodology: LoRA-based Supervised Fine-Tuning (SFT) resulting in a merged BF16 checkpoint.
- Reasoning Architecture: Native support and normalization for the
<think>...</think>format to explicitly separate the reasoning process from the final output. - Optimization Focus: Enhancing safety reasoning, maximizing controllability, and ensuring response consistency.
Data
This model was trained on Merlin Research private datasets built from internal R&D pipelines for:
- reasoning reliability improvements,
- instruction-following robustness,
- safety behavior refinement,
- misalignment reduction in applied scenarios.
- Using Anthropic’s framework Bloom&Petri for for better behavioral alignment.
(https://www.anthropic.com/research/petri-open-source-auditing)
Intended Use Cases
This model is particularly well-suited for:
- Building safety-oriented reasoning assistants and chatbots.
- Tasks requiring strict, constrained instruction-following.
- Experimentation in AI alignment, safety research, and robustness testing.
- Agentic workflows where predictable and safe autonomous behavior is required.
GGUF Status
GGUF artifacts are currently in active development and validation.
At this stage, we recommend using the BF16 Transformers checkpoint for stable results. Updated and fully validated GGUF builds will be published in future releases.
For Ollama
bash
ollama create qwen35-safety-thinking-bf16 -f Modelfileollama run qwen35-safety-thinking-bf16
Organization
Designed, developed, and maintained with ❤️ by Merlin Research.
Citation
If you utilize this model in your research or applications, please cite it as follows:
bibtex
@misc{qwen3.5-4b-safety-thinking,author = {Merlin Research},title = {Qwen3.5-4B-Safety-Thinking: A Reasoning and Safety Aligned Model},year = {2026},publisher = {Hugging Face},howpublished = {\url{https://huggingface.co/MerlinSafety/Qwen3.5-4B-Safety-Thinking}},note = {Base model: Qwen/Qwen3.5-4B}}
Model provider
minhnguyent546
Model tree
Base
Qwen/Qwen3.5-4B
Fine-tuned
this model
Modalities
Input
Video, Text, Image
Output
Text
Pricing
Dedicated Endpoints
View detailsSupported Functionality
Model APIs
Dedicated Endpoints
Container
More information