malvavisc0

qwen3.5-9b-opus-agent-gptq-int8

README

License: apache-2.0

Usage

python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "malvavisc0/qwen3.5-9b-opus-agent-gptq-int8",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("malvavisc0/qwen3.5-9b-opus-agent-gptq-int8")

Benchmarks

Same benchmarks as the original model:

Table with columns: Model, ARC, ARC/E, BoolQ
Model	ARC	ARC/E	BoolQ
Qwen3.5-9B-Opus-Agent	0.589	0.747	0.901

Notes

Quantized with GPTQ 8-bit using gptqmodel 7.1.0
Act-aware quantization enabled
Compatible with vLLM for efficient inference

Available on FriendliAI

Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more

Model Details

Model Provider

malvavisc0

Model Tree

Base

armand0e/Qwen3.5-9B-Opus-Agent

Quantized

this model

Input Modalities

Text

Image

Video

Output Modalities

Text

Supported Functionality

Dedicated Endpoints

Explore FriendliAI today

Get started Talk to an engineer