SCAI-JHU
MindZero-gw-tom-Qwen3-VL-8B-Instruct
License: apache-2.0
TL;DR
MindZero trains (M)LLMs to perform efficient and robust online mental reasoning without any mental-state annotations. During training, the model is rewarded for generating mental-state hypotheses that maximize the likelihood of the observed actions, as estimated by a planner; this is analogous to model-based theory-of-mind (ToM) reasoning. After training, MindZero internalizes this reasoning into fast single-pass inference.
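The reward idea in the TL;DR can be illustrated with a toy example. The sketch below is not MindZero's training code or its planner. It assumes a hypothetical 2-D gridworld, a Boltzmann-rational planner with an assumed rationality parameter `beta`, and mental-state hypotheses that consist only of a goal cell. All function and variable names are illustrative. It scores each hypothesis by the summed log-likelihood of the observed actions, which is the kind of signal the description says the model is rewarded for maximizing.

```python
import math

# Hypothetical 2-D gridworld: each action moves the agent by one cell.
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def action_log_probs(state, goal, beta=2.0):
    """Toy Boltzmann-rational planner (an assumption, not MindZero's planner).

    Actions that leave the agent closer to the goal (Manhattan distance)
    receive higher probability; beta controls how sharply.
    """
    scores = {}
    for name, (dx, dy) in ACTIONS.items():
        nx, ny = state[0] + dx, state[1] + dy
        dist = abs(goal[0] - nx) + abs(goal[1] - ny)
        scores[name] = -beta * dist
    # Normalize with a log-sum-exp over the four actions.
    log_z = math.log(sum(math.exp(s) for s in scores.values()))
    return {name: s - log_z for name, s in scores.items()}


def hypothesis_reward(trajectory, goal, beta=2.0):
    """Sum of log-likelihoods of the observed actions under a hypothesized goal."""
    return sum(
        action_log_probs(state, goal, beta)[action]
        for state, action in trajectory
    )


# Observed (state, action) pairs: the agent walks right along the row y = 0.
trajectory = [((0, 0), "right"), ((1, 0), "right"), ((2, 0), "right")]

# Two candidate mental-state hypotheses, here reduced to goal locations.
hypotheses = {"goal_east": (4, 0), "goal_south": (0, 4)}

rewards = {name: hypothesis_reward(trajectory, g) for name, g in hypotheses.items()}
best = max(rewards, key=rewards.get)
print(best, rewards)
```

In this toy setting the hypothesis that best explains the rightward moves gets the higher reward. In the setup the TL;DR describes, the hypotheses would be generated by the model itself rather than enumerated by hand.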
Evaluation
| Base model | Checkpoint | Gridworld-QA |
|---|---|---|
| Qwen/Qwen3-VL-4B-Instruct | MindZero-gw-tom-Qwen3-VL-4B-Instruct | 95.0 |
| Qwen/Qwen3-VL-8B-Instruct | MindZero-gw-tom-Qwen3-VL-8B-Instruct | 92.3 |
Citation
@inproceedings{zhang2026mindzero,
  title = {MindZero: Learning Online Mental Reasoning With Zero Annotations},
  author = {Shunchi Zhang and Jin Lu and Chuanyang Jin and Yichao Zhou and Zhining Zhang and Tianmin Shu},
  booktitle = {Proceedings of the 43rd International Conference on Machine Learning (ICML)},
  year = {2026}
}
Model provider: SCAI-JHU
Model tree
Base: Qwen/Qwen3-VL-8B-Instruct
Fine-tuned: this model
Modalities
Input: Text, Image
Output: Text