Qwen
Qwen3-235B-A22B-Instruct-2507
Features significant gains in instruction following, reasoning, coding, and multilingual capabilities with native 256K context extendable to 1M tokens.
Highlights
Introducing the updated version of the Qwen3-235B-A22B non-thinking mode, named Qwen3-235B-A22B-Instruct-2507, featuring the following key enhancements:
- Significant improvements in general capabilities, including instruction following, logical reasoning, text comprehension, mathematics, science, coding and tool usage.
- Substantial gains in long-tail knowledge coverage across multiple languages.
- Markedly better alignment with user preferences in subjective and open-ended tasks, enabling more helpful responses and higher-quality text generation.
- Enhanced capabilities in 256K long-context understanding.
Model Overview
Qwen3-235B-A22B-Instruct-2507 has the following features:
- Type: Causal Language Models
- Training Stage: Pretraining & Post-training
- Number of Parameters: 235B in total and 22B activated
- Number of Parameters (Non-Embedding): 234B
- Number of Layers: 94
- Number of Attention Heads (GQA): 64 for Q and 4 for KV
- Number of Experts: 128
- Number of Activated Experts: 8
- Context Length: 262,144 natively and extendable up to 1,010,000 tokens
NOTE: This model supports only non-thinking mode and does not generate <think></think> blocks in its output. Meanwhile, specifying enable_thinking=False is no longer required.
Run this model inference with a simple API call.
Run this model inference on single tenant GPU with unmatched speed and reliability at scale.
Run this model inference with full control and performance in your environment.
API Example
Get help setting up a custom Dedicated Endpoints.
Talk with our engineer to get a quote for reserved GPU instances with discounts.
Model provider
Qwen
Model tree
Base
this model
Modalities
Input
Text
Output
Text
Pricing
Serverless Endpoints
Input
$0.2 / 1M tokens
Output
$0.8 / 1M tokens
Dedicated Endpoints
View detailsSupported Functionality
Serverless Endpoints
Dedicated Endpoints
Container
More information