aastalll

Qwen3.5-35B-A3B-NVFP4-MTP

README

License: apache-2.0

As of 2/27/2026, this model is supported in vLLM nightly. To serve the model:

bash
vllm serve Kbenkhaled/Qwen3.5-35B-A3B-NVFP4 \
    --reasoning-parser qwen3 \
    --enable-prefix-caching

Evaluated with lm-evaluation-harness, 0-shot, thinking mode ON.

Table with columns: Benchmark, Qwen3.5-35B-A3B, Qwen3.5-35B-A3B-NVFP4 (this model), Recovery
Benchmark	Qwen3.5-35B-A3B	Qwen3.5-35B-A3B-NVFP4 (this model)	Recovery
GPQA Diamond	81.31%	80.81%	99.4%
IFEval	95.56%	92.93%	97.2%
MMLU-Redux	92.51%	92.31%	99.8%
Average	89.79%	88.68%	98.8%

Available on FriendliAI

Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Model Details

Model Provider

aastalll

Model Tree

Base

Qwen/Qwen3.5-35B-A3B

Quantized

this model

Input Modalities

Text

Image

Video

Output Modalities

Text

Supported Functionality

Dedicated Endpoints