qwen3vl-8b-fp8-text-only-en API & Inference Endpoint

Modifications

Vision removed: All model.visual.* tensors (351 tensors) were dropped, leaving only the text decoder (36 layers, 903 tensors).
English-only vocab: Non-English tokens (CJK, Cyrillic, Arabic, etc.) were pruned from the tokenizer and embedding matrix. Vocab reduced from 151,936 to 105,785.
FP8 preserved: The original comfy_quant and weight_scale metadata is intact. No requantization was performed.
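The first two modifications can be sketched in a few lines. This is a minimal illustration, not the script used to build this checkpoint: only the `model.visual.` prefix comes from the card, while the tensor names in the toy state dict, the helper names, and the five-token vocabulary are hypothetical. Plain Python lists stand in for real tensors.

```python
from typing import Dict, List, Sequence, TypeVar

T = TypeVar("T")

VISION_PREFIX = "model.visual."  # prefix named in the model card


def drop_vision_tensors(state: Dict[str, T], prefix: str = VISION_PREFIX) -> Dict[str, T]:
    """Return a copy of the state dict without any key under `prefix`."""
    return {name: tensor for name, tensor in state.items() if not name.startswith(prefix)}


def build_id_remap(kept_ids: Sequence[int]) -> Dict[int, int]:
    """Map each surviving old token id to a dense new id, preserving order."""
    return {old: new for new, old in enumerate(sorted(set(kept_ids)))}


def prune_rows(matrix: Sequence[T], remap: Dict[int, int]) -> List[T]:
    """Keep only the embedding rows whose old id survives, in new-id order."""
    ordered_old_ids = sorted(remap, key=remap.get)
    return [matrix[old] for old in ordered_old_ids]


# Toy state dict; these tensor names are hypothetical placeholders.
state = {
    "model.visual.patch_embed.weight": "vision",
    "model.language_model.embed_tokens.weight": "text",
    "lm_head.weight": "text",
}
text_only = drop_vision_tensors(state)

# Toy 5-token vocabulary; pretend ids 1 and 4 are non-English tokens.
embedding = [[0.0], [0.1], [0.2], [0.3], [0.4]]
remap = build_id_remap([0, 2, 3])
pruned = prune_rows(embedding, remap)
```

The same remap would have to be applied to the tokenizer vocabulary and merges and to every vocab-sized matrix, so that token ids and rows stay aligned after pruning.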

Base model

Original: Qwen/Qwen3-VL-8B

Files

model.safetensors — text-only weights (BF16 embeds + FP8 layer weights)
tokenizer.json — pruned BPE tokenizer
config.json — Qwen3ForCausalLM config with updated vocab_size
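To check the contents of model.safetensors without loading any weights, the file header can be read with the standard library alone. The sketch below relies on the safetensors layout (an 8-byte little-endian header length followed by a JSON header); the function names and the local path are illustrative assumptions.

```python
import json
import struct
from collections import Counter
from typing import Dict


def read_safetensors_header(path: str) -> Dict[str, dict]:
    """Parse the JSON header of a .safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian u64 length
        header = json.loads(f.read(header_len))
    header.pop("__metadata__", None)  # optional free-form metadata entry
    return header


def summarize(header: Dict[str, dict], vision_prefix: str = "model.visual.") -> dict:
    """Count tensors, leftover vision tensors, and tensors per dtype."""
    return {
        "tensors": len(header),
        "vision_tensors": sum(name.startswith(vision_prefix) for name in header),
        "dtypes": dict(Counter(entry["dtype"] for entry in header.values())),
    }
```

Calling `summarize(read_safetensors_header("model.safetensors"))` on a local copy should report no vision tensors if the file matches the description above, and the dtype counts show how the tensors split between BF16 and FP8.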

Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("mindqtrl/qwen3vl-8b-fp8-text-only-en")
tokenizer = AutoTokenizer.from_pretrained("mindqtrl/qwen3vl-8b-fp8-text-only-en")
```

Stats

| Metric | Value |
| --- | --- |
| Original size | 10.59 GB |
| Text-only size | 9.44 GB |
| English-only size | 8.68 GB |
| Vocab (original) | 151,936 |
| Vocab (pruned) | 105,785 |
| Layers | 36 |
| Hidden size | 4096 |
| Attention heads | 32 |
| KV heads | 8 |
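As a sanity check on these figures, the saving from vocabulary pruning can be estimated from the numbers in the table. The calculation below assumes two things the card does not state: that the input embedding and output head are separate (untied) BF16 matrices, and that sizes are in decimal gigabytes.

```python
HIDDEN_SIZE = 4096
VOCAB_ORIGINAL = 151_936
VOCAB_PRUNED = 105_785
BYTES_PER_BF16 = 2
NUM_VOCAB_MATRICES = 2  # assumption: untied input embedding + output head

rows_removed = VOCAB_ORIGINAL - VOCAB_PRUNED
bytes_saved = rows_removed * HIDDEN_SIZE * BYTES_PER_BF16 * NUM_VOCAB_MATRICES
gb_saved = bytes_saved / 1e9  # assumption: decimal GB

reported_gb_saved = 9.44 - 8.68  # text-only size minus English-only size

print(f"rows removed: {rows_removed:,}")
print(f"estimated saving: {gb_saved:.2f} GB (sizes above imply {reported_gb_saved:.2f} GB)")
```

Under those assumptions the estimate is about 0.76 GB, which agrees with the difference between the text-only and English-only sizes.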

