two-tiger
Qwen2.5-VRPRM-7B
Run this model inference on single tenant GPU with unmatched speed and reliability at scale.
Run this model inference with full control and performance in your environment.
Get help setting up a custom Dedicated Endpoints.
Talk with our engineer to get a quote for reserved GPU instances with discounts.
README
Model Details
- Model family: VRPRM
- Backbone family: Qwen2.5-VL 7B
- Serialized architecture:
Qwen2_5_VLForConditionalGeneration - Model type:
qwen2_5_vl - Weights format: sharded
safetensors - Recommended library:
transformers
Training Summary
The VRPRM paper trains the model with a two-stage recipe:
- Supervised fine-tuning cold start on high-quality CoT-PRM data. Open-sourced on VRPRM3.6K.
- Reinforcement learning scaling on lower-cost non-CoT PRM data.
Intended Use
This model is intended for research on:
- Visual process reward modeling
- Multimodal reasoning evaluation
- Step-level scoring of visual question answering rationales
- Best-of-N selection for vision-language model responses
This model is not intended to be used as a standalone assistant.
Usage
Load the model with Hugging Face Transformers from the repository root:
python
from transformers import AutoModelForVision2Seq, AutoProcessormodel_id = "YOUR_USERNAME/VRPRM-Qwen2.5VL-7B"processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)model = AutoModelForVision2Seq.from_pretrained(model_id,torch_dtype="auto",device_map="auto",trust_remote_code=True,)
For the complete inference and evaluation pipeline, use the VRPRM project code.
Citation
bibtex
@misc{chen2026vrprmprocessrewardmodeling,title={VRPRM: Process Reward Modeling via Visual Reasoning},author={Xinquan Chen and Chongying Yue and Bangwei Liu and Xuhong Wang and Yingchun Wang and Chaochao Lu},year={2026},eprint={2508.03556},archivePrefix={arXiv},primaryClass={cs.LG},url={https://arxiv.org/abs/2508.03556},}
Model provider
two-tiger
Model tree
Base
this model
Modalities
Input
Text, Image
Output
Text
Pricing
Dedicated Endpoints
View detailsSupported Functionality
Model APIs
Dedicated Endpoints
Container
More information