edougawa

Nex-N2-mini-Abliterated-NVFP4

⚠️ Safety disclaimer

This model has had its built-in refusal behavior deliberately reduced. As a result it may produce unexpected, offensive, inaccurate, or otherwise harmful output, and may comply with requests that the original model would have refused.

It is provided by the publisher, edougawa, "as is" and without warranty of any kind, express or implied. Use at your own risk.
You are solely responsible for how you use this model and for ensuring your use — and any generated output — complies with all applicable laws, regulations, and the terms of the base model's license.
To the maximum extent permitted by law, the publisher (edougawa), the base-model authors (Nex-AGI), and the Abliterix authors accept no liability for any claim, damages, or other consequences arising from the use of this model or its outputs.
Outputs do not reflect the views of the publisher (edougawa), the base-model authors (Nex-AGI), or the Abliterix authors. Apply your own safety filtering, human review, and guardrails before any production or user-facing use.

Quantization

ModelOpt: 0.44.0
PyTorch: 2.11.0+cu130
Transformers: 5.12.0
Format: nvfp4_experts_only
Calibration samples: 256,128,128
Calibration sequence length: 2048
KV cache in checkpoint: unquantized
Target hardware: NVIDIA GB10 / Blackwell SM121
Runtime target: vLLM ModelOpt FP4 loader

The expert-only preset quantizes only the MoE expert weights, which dominate the parameter count, to NVFP4. Attention, embeddings, LM head, vision encoder, and MTP-sensitive weights stay at their exported higher precision. This choice prioritizes accuracy over maximum compression.
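To make "quantizes to NVFP4" concrete, here is a minimal pure-Python sketch of the general NVFP4 idea: weights are split into blocks of 16, each block shares one scale, and every value is snapped to the FP4 (E2M1) grid. It is an illustration only, not the ModelOpt implementation. The function names are hypothetical, and the block scale is kept as an ordinary float, whereas the real format stores it in FP8 (E4M3) together with a per-tensor FP32 scale.

```python
# Illustrative NVFP4-style fake quantization of a single 16-element block.
# Assumptions / simplifications (not taken from the model card):
#   * the per-block scale stays a Python float instead of FP8 (E4M3)
#   * no per-tensor FP32 scale is applied
#   * ties go to the smaller magnitude; real kernels may round differently

E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)  # FP4 E2M1 grid
BLOCK_SIZE = 16


def quantize_block(block):
    """Return (scale, codes); each code is a signed value on the E2M1 grid."""
    if len(block) != BLOCK_SIZE:
        raise ValueError(f"expected {BLOCK_SIZE} values, got {len(block)}")
    amax = max(abs(x) for x in block)
    # Map the largest magnitude in the block onto the largest grid value (6.0).
    scale = amax / E2M1_MAGNITUDES[-1] if amax > 0 else 1.0
    codes = []
    for x in block:
        target = abs(x) / scale
        nearest = min(E2M1_MAGNITUDES, key=lambda m: abs(m - target))
        codes.append(nearest if x >= 0 else -nearest)
    return scale, codes


def dequantize_block(scale, codes):
    """Reconstruct approximate weights from the shared scale and grid codes."""
    return [scale * c for c in codes]


if __name__ == "__main__":
    weights = [0.03 * i - 0.2 for i in range(BLOCK_SIZE)]
    scale, codes = quantize_block(weights)
    restored = dequantize_block(scale, codes)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print(f"scale={scale:.4f} worst-case abs error={worst:.4f}")
```

Because the grid has only eight magnitudes, the rounding error per weight can be noticeable, which is one reason a preset might leave accuracy-sensitive layers at higher precision.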

Features

Text generation
Image understanding: architecture and config preserved
Video: token and config preserved

vLLM

Use only on NVIDIA Blackwell hardware with an NVFP4-capable vLLM build. Review the source model card for its intended use, limitations, and safety notes.
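As a rough sketch of what a launch might look like, the helper below checks a GPU compute capability and assembles a `vllm serve` command. Several things here are assumptions rather than facts from this card: the hub path `edougawa/Nex-N2-mini-Abliterated-NVFP4` is inferred from the publisher and model name, the `--max-model-len` value is a placeholder, treating compute capability major version 10 or higher as Blackwell is a simplification, and recent vLLM builds may detect the ModelOpt FP4 format from the checkpoint without the explicit `--quantization modelopt_fp4` flag.

```python
import shlex

# Assumed hub path, inferred from the publisher and model name on this card.
MODEL_ID = "edougawa/Nex-N2-mini-Abliterated-NVFP4"


def is_blackwell(capability):
    """Rough check on a (major, minor) compute capability tuple.

    Assumption: major >= 10 means Blackwell (for example 10.0, 12.0, 12.1).
    The tuple could come from torch.cuda.get_device_capability().
    """
    major, _minor = capability
    return major >= 10


def build_serve_command(model_id=MODEL_ID, max_model_len=8192):
    """Build an argument list for `vllm serve`; max_model_len is a placeholder."""
    return [
        "vllm", "serve", model_id,
        "--quantization", "modelopt_fp4",
        "--max-model-len", str(max_model_len),
    ]


if __name__ == "__main__":
    capability = (12, 1)  # the card's stated target: GB10 / SM121
    if not is_blackwell(capability):
        raise SystemExit("This checkpoint targets Blackwell GPUs.")
    print(shlex.join(build_serve_command()))
```

The command is only printed, not executed, so the sketch can be run on any machine to inspect the arguments before trying them on real hardware.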

Model provider

edougawa

Model tree

Base: edougawa/Nex-N2-mini-Abliterated
Quantized: this model

Modalities

Input: Video, Text, Image
Output: Text