osmapi

osmQwopus3.6-27B-Fable-Agentic

Deploy Dedicated

Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

README

License: apache-2.0

Benchmarks

Evaluated head-to-head against the base model on identical questions, with an 8192-token budget so step-by-step reasoning is never truncated.

Table
Benchmark	Base	This model
MMLU-Pro (350 Q)	86.86%	88.29%
GSM8K (300 Q)	98.00%	97.33%
GPQA-Diamond (198 Q)	57.58%	57.07%

Knowledge improved, hard reasoning preserved: +1.43 over base on MMLU-Pro, matching it on GSM8K math and on GPQA-Diamond (graduate-level science, ~frontier-class for a 27B) - with agentic tool-calling added on top.

Table
Category	Base	This model
Biology	96.0%	94.0%
Business	78.0%	88.0%
Chemistry	84.0%	80.0%
Computer Science	84.0%	84.0%
Health	82.0%	86.0%
Math	94.0%	94.0%
Physics	90.0%	92.0%
Overall	86.86%	88.29%

Highlights

Top-tier knowledge - 88.29% MMLU-Pro, above the base model
Strong math - 97.33% GSM8K
Agentic tool-calling across multi-turn sessions
Step-by-step reasoning built in

Capabilities

Agentic, multi-turn tool-calling
Chain-of-thought reasoning
Top-tier general knowledge and math
Multimodal (vision + text) architecture

Usage (vLLM)

bash
vllm serve osmapi/osmQwopus3.6-27B-Fable-Agentic --trust-remote-code --reasoning-parser qwen3 --enable-auto-tool-choice --tool-call-parser qwen3_coder --max-model-len 16384