edougawa
Nex-N2-mini-Abliterated-NVFP4
README
⚠️ Safety disclaimer
This model has had its built-in refusal behavior deliberately reduced. As a result it may produce unexpected, offensive, inaccurate, or otherwise harmful output, and may comply with requests that the original model would have refused.
- It is provided by the publisher, edougawa, "as is" and without warranty of any kind, express or implied. Use at your own risk.
- You are solely responsible for how you use this model and for ensuring your use — and any generated output — complies with all applicable laws, regulations, and the terms of the base model's license.
- To the maximum extent permitted by law, the publisher (edougawa), the base-model authors (Nex-AGI), and the Abliterix authors accept no liability for any claim, damages, or other consequences arising from the use of this model or its outputs.
- Outputs do not reflect the views of the publisher (edougawa), the base-model authors (Nex-AGI), or the Abliterix authors. Apply your own safety filtering, human review, and guardrails before any production or user-facing use.
Quantization
- ModelOpt: 0.44.0
- PyTorch: 2.11.0+cu130
- Transformers: 5.12.0
- Format: nvfp4_experts_only
- Calibration samples: 256,128,128
- Calibration sequence length: 2048
- KV cache in checkpoint: unquantized
- Target hardware: NVIDIA GB10 / Blackwell SM121
- Runtime target: vLLM ModelOpt FP4 loader
The expert-only preset quantizes the dominant MoE expert weights to NVFP4 while retaining attention, embeddings, LM head, vision encoder, and MTP-sensitive weights at their exported higher precision. This choice prioritizes accuracy over the additional memory savings that quantizing every layer would give.
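The expert-only split described above can be pictured as a name-based filter over the model's parameters. The following is only a minimal sketch of that idea, not the ModelOpt preset itself: the pattern `*.experts.*` and the parameter names are hypothetical and chosen for illustration, and the real checkpoint may name its modules differently.

```python
from fnmatch import fnmatch

# Hypothetical pattern: assumes MoE expert weights live under a ".experts." path.
EXPERT_PATTERNS = ["*.experts.*"]


def planned_precision(param_name: str) -> str:
    """Return 'nvfp4' for MoE expert weights, 'higher_precision' otherwise."""
    if any(fnmatch(param_name, pattern) for pattern in EXPERT_PATTERNS):
        return "nvfp4"
    return "higher_precision"


# Illustrative parameter names only (not read from the actual checkpoint).
example_names = [
    "model.layers.0.mlp.experts.3.down_proj.weight",
    "model.layers.0.self_attn.q_proj.weight",
    "model.embed_tokens.weight",
    "lm_head.weight",
]

plan = {name: planned_precision(name) for name in example_names}

for name, precision in plan.items():
    print(f"{precision:17s} {name}")
```

In this sketch only the first name is marked for NVFP4; attention, embedding, and LM head weights stay at higher precision, mirroring the split the preset describes.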
Features
- Text generation
- Image understanding architecture/config preserved
- Video token/config preserved
vLLM
Use only on NVIDIA Blackwell hardware with an NVFP4-capable vLLM build. Review the source model card for its intended use, limitations, and safety notes.
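As a rough pre-flight sketch, the script below checks the GPU's compute capability and prints a `vllm serve` command. Several things here are assumptions rather than facts from this card: the repository id is inferred from the publisher and model name, the `--max-model-len` value is a placeholder, treating compute capability major version 10 or higher as "Blackwell or newer" is a simplification, and passing `--quantization modelopt_fp4` explicitly may be unnecessary if your vLLM build detects the format from the checkpoint config.

```python
import shlex

# Assumed repository id, inferred from the publisher and model name on this card.
MODEL_ID = "edougawa/Nex-N2-mini-Abliterated-NVFP4"


def is_blackwell_or_newer(capability):
    """Simplified check: treat compute capability major >= 10 as Blackwell or newer.

    GB10 reports (12, 1), which corresponds to SM121.
    """
    major, _minor = capability
    return major >= 10


def build_serve_command(model_id, max_model_len=8192):
    """Build a `vllm serve` argument list. max_model_len is a placeholder value."""
    return [
        "vllm", "serve", model_id,
        "--quantization", "modelopt_fp4",
        "--max-model-len", str(max_model_len),
    ]


if __name__ == "__main__":
    capability = None
    try:
        import torch  # optional; used only to read the local GPU capability

        if torch.cuda.is_available():
            capability = torch.cuda.get_device_capability()
    except ImportError:
        pass

    if capability is None:
        print("Could not detect a CUDA GPU; skipping the hardware check.")
    elif not is_blackwell_or_newer(capability):
        print(f"Compute capability {capability} is likely unsupported for NVFP4.")

    print(shlex.join(build_serve_command(MODEL_ID)))
```

The command is built as a list so it can be passed to `subprocess.run` directly or printed for copy and paste; adjust the flags to match your installed vLLM version.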
Model provider
edougawa
Model tree
Base
edougawa/Nex-N2-mini-Abliterated
Quantized
this model
Modalities
Input
Video, Text, Image
Output
Text