hollow404

MDS-VQA-Active-Finetuning

Model Details

Model type: no-reference video quality assessment vision-language model
Checkpoint type: PEFT / LoRA adapter for active fine-tuning
Backbone family: Qwen2.5-VL / VisualQuality-R1-style VLM
Base model: hollow404/VQR1-7B-YouTubeUGC
LoRA rank: 64
LoRA alpha: 128
LoRA dropout: 0.05
Training data: YouTube-UGC + MDS-VQA-selected labeled samples from YouTube-SFV SDR
Input: a video plus a VQA prompt
Output: a quality score on a 1 to 5 scale, typically inside <answer>...</answer> tags
License: Apache 2.0

Intended Use

This model is intended for research on no-reference video quality assessment, active data selection, and target-domain adaptation for VQA. Typical uses include:

evaluating the active fine-tuning stage of the MDS-VQA pipeline;
predicting perceptual quality scores for YouTube-SFV SDR;
comparing active fine-tuning against the YouTube-UGC baseline model;
studying how model-informed data selection improves VQA generalization.

This checkpoint is a LoRA adapter, so it must be loaded on top of the base model (hollow404/VQR1-7B-YouTubeUGC) rather than used on its own. It is not intended as a universal production QoE monitor without domain-specific validation.
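A minimal sketch of attaching the adapter to the base model with the Hugging Face `transformers` and `peft` libraries. Several points are assumptions rather than details stated in this card: the adapter repository id `hollow404/MDS-VQA-Active-Finetuning` is inferred from the page title, the base checkpoint is assumed to load with the Qwen2.5-VL model class, and the helper name `load_model_with_adapter` is hypothetical.

```python
BASE_MODEL_ID = "hollow404/VQR1-7B-YouTubeUGC"      # base model named in this card
ADAPTER_ID = "hollow404/MDS-VQA-Active-Finetuning"  # assumed repo id for this adapter


def load_model_with_adapter(base_id=BASE_MODEL_ID, adapter_id=ADAPTER_ID):
    """Load the base VLM, then attach the LoRA adapter on top of it."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration
    from peft import PeftModel

    # Assumption: the base checkpoint is compatible with the Qwen2.5-VL class.
    base = Qwen2_5_VLForConditionalGeneration.from_pretrained(
        base_id, torch_dtype="auto"
    )
    # Wrap the base model with the LoRA weights from the adapter repository.
    model = PeftModel.from_pretrained(base, adapter_id)
    processor = AutoProcessor.from_pretrained(base_id)
    return model.eval(), processor
```

Calling `load_model_with_adapter()` downloads both repositories, so it needs network access and enough memory for a 7B model.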

Prompt Format

The model follows the VisualQuality-R1-style scoring prompt used in MDS-VQA:

You are doing the video quality assessment task.
Here is the question: What is your overall rating on the quality of this video? The rating should be a float between 1 and 5, rounded to two decimal places, with 1 representing very poor quality and 5 representing excellent quality.
First output the thinking process in <think> </think> tags and then output the final answer with only one score in <answer> </answer> tags.

For automatic evaluation, parse the scalar value inside the final <answer> tag.
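One way to do this parsing is sketched below with Python's standard `re` module. The function name `parse_quality_score` is hypothetical, and two behaviours are assumptions rather than rules from MDS-VQA: scores outside the 1 to 5 range are rejected, and `None` is returned when no usable answer is found.

```python
import re

# Matches the content of every <answer>...</answer> pair in the response.
_ANSWER_RE = re.compile(r"<answer>\s*(.*?)\s*</answer>", re.DOTALL | re.IGNORECASE)
# Matches the first integer or decimal number in a string.
_NUMBER_RE = re.compile(r"[-+]?\d+(?:\.\d+)?")


def parse_quality_score(response):
    """Return the score in the last <answer> tag, or None if it is unusable."""
    answers = _ANSWER_RE.findall(response)
    if not answers:
        return None
    match = _NUMBER_RE.search(answers[-1])  # use the final answer tag only
    if match is None:
        return None
    score = float(match.group())
    # Assumption: treat values outside the prompt's 1-5 scale as invalid.
    return score if 1.0 <= score <= 5.0 else None
```

For example, `parse_quality_score("<think>slight blur</think><answer>3.47</answer>")` returns `3.47`.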

MDS-VQA Context

MDS-VQA is a model-informed data selection mechanism for VQA. Given an unlabeled target video pool, it selects videos that are both:

Difficult for the base VQA model: estimated by a failure predictor trained to rank videos by the base model's prediction errors.
Diverse in content: estimated from semantic video features, using a diversity-aware greedy selection procedure.

The selected videos are then labeled and merged with the original labeled source dataset for active fine-tuning. This repository provides the resulting active fine-tuning checkpoint.
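To illustrate the general idea of combining difficulty with diversity, here is a small greedy selection sketch in pure Python. It is not the MDS-VQA objective: the scoring rule (predicted difficulty plus a weighted distance to the nearest already selected video), the function name `select_videos`, and the `diversity_weight` parameter are all assumptions made for illustration.

```python
import math


def select_videos(difficulty, features, budget, diversity_weight=1.0):
    """Greedily pick up to `budget` indices, trading off difficulty and diversity.

    difficulty: per-video scores from a failure predictor (higher = harder).
    features:   per-video semantic feature vectors (sequences of floats).
    """
    selected = []
    remaining = set(range(len(difficulty)))
    while remaining and len(selected) < budget:

        def gain(i):
            if not selected:
                return difficulty[i]
            # Distance to the closest already selected video rewards new content.
            nearest = min(math.dist(features[i], features[j]) for j in selected)
            return difficulty[i] + diversity_weight * nearest

        # Sorting makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(remaining), key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `diversity_weight=0` this reduces to picking the videos with the highest predicted difficulty. Larger weights push the selection toward videos that are far apart in feature space.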

Citation

If you use this model, please cite MDS-VQA:

@article{zou2026mds,
  title={MDS-VQA: Model-Informed Data Selection for Video Quality Assessment},
  author={Zou, Jian and Xu, Xiaoyu and Wang, Zhihua and Wang, Yilin and Adsumilli, Balu and Ma, Kede},
  journal={arXiv preprint arXiv:2603.11525},
  year={2026}
}
