two-tiger

Qwen2.5-VRPRM-7B

Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more
Container

Run this model inference with full control and performance in your environment.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

README

Model Details

  • Model family: VRPRM
  • Backbone family: Qwen2.5-VL 7B
  • Serialized architecture: Qwen2_5_VLForConditionalGeneration
  • Model type: qwen2_5_vl
  • Weights format: sharded safetensors
  • Recommended library: transformers

Training Summary

The VRPRM paper trains the model with a two-stage recipe:

  1. Supervised fine-tuning cold start on high-quality CoT-PRM data. Open-sourced on VRPRM3.6K.
  2. Reinforcement learning scaling on lower-cost non-CoT PRM data.

Intended Use

This model is intended for research on:

  • Visual process reward modeling
  • Multimodal reasoning evaluation
  • Step-level scoring of visual question answering rationales
  • Best-of-N selection for vision-language model responses

This model is not intended to be used as a standalone assistant.

Usage

Load the model with Hugging Face Transformers from the repository root:

python

from transformers import AutoModelForVision2Seq, AutoProcessor
model_id = "YOUR_USERNAME/VRPRM-Qwen2.5VL-7B"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForVision2Seq.from_pretrained(
model_id,
torch_dtype="auto",
device_map="auto",
trust_remote_code=True,
)

For the complete inference and evaluation pipeline, use the VRPRM project code.

Citation

bibtex

@misc{chen2026vrprmprocessrewardmodeling,
title={VRPRM: Process Reward Modeling via Visual Reasoning},
author={Xinquan Chen and Chongying Yue and Bangwei Liu and Xuhong Wang and Yingchun Wang and Chaochao Lu},
year={2026},
eprint={2508.03556},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2508.03556},
}

Model provider

two-tiger

Model tree

Base

this model

Modalities

Input

Text, Image

Output

Text

Pricing

Dedicated Endpoints

View details

Supported Functionality

Model APIs

Dedicated Endpoints

Container

More information

Explore FriendliAI today