hollow404

MDS-VQA-Failure-Predictor

Deploy Dedicated

Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more

Container

Run this model inference with full control and performance in your environment.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

Model Details

Model type: failure predictor / video difficulty estimator for no-reference VQA
Checkpoint type: PEFT / LoRA adapter
Backbone family: Qwen2.5-VL / VisualQuality-R1-style VLM
Base model: hollow404/VQR1-7B-YouTubeUGC
LoRA rank: 64
LoRA alpha: 128
LoRA dropout: 0.05
Training data: YouTube-UGC with base-model prediction scores/errors
Input: a video plus a failure-prediction prompt
Output: a scalar difficulty score on a 1 to 5 scale, typically inside <answer>...</answer> tags
License: Apache 2.0

Intended Use

This checkpoint is intended for research on model-informed data selection for video quality assessment. Typical uses include:

estimating which target-domain videos are difficult for the base VQA model;
ranking an unlabeled video pool by predicted failure/difficulty;
providing the difficulty term in the MDS-VQA greedy selection pipeline;
selecting samples for labeling before active fine-tuning.

This model is not a final video quality scoring model. A higher score means the video is predicted to be harder for the base VQA model to evaluate, not that the video has higher or lower perceptual quality.

Prompt Format

The failure predictor follows the prompt used by src/inference.py in MDS-VQA:

text
You are doing the video quality assessment task. Here is the question:
Assess how difficult it is to evaluate this video's quality for video quality assessment. The difficulty rating should be a float between 1 and 5, rounded to two decimal places, with 1 representing very easy to evaluate and 5 representing very difficult to evaluate.
Please only output the final answer with only one score in <answer> </answer> tags.

For automatic evaluation or selection, parse the scalar value inside the final tag.

The output is a JSON file mapping video names to predicted difficulty scores, for example:

json
{
  "example.mp4": {
    "reasoning": "N/A",
    "score": 3.0
  }
}

MDS-VQA Context

MDS-VQA selects target-domain videos using two complementary signals:

Predicted difficulty: estimated by this failure predictor, which identifies videos likely to expose errors of the base VQA model. Content diversity: computed from semantic video features and incorporated through a greedy selection procedure. The selected videos are then labeled and merged with the source-domain training set for active fine-tuning. The resulting active fine-tuned checkpoint on YouTube-SFV SDR is available at hollow404/MDS-VQA-Active-Finetuning.

Citation

If you use this model, please cite MDS-VQA:

bibtex
@article{zou2026mds,
  title={MDS-VQA: Model-Informed Data Selection for Video Quality Assessment},
  author={Zou, Jian and Xu, Xiaoyu and Wang, Zhihua and Wang, Yilin and Adsumilli, Balu and Ma, Kede},
  journal={arXiv preprint arXiv:2603.11525},
  year={2026}
}

Model provider

hollow404

Model tree

Base

hollow404/VQR1-7B-YouTubeUGC

Adapter

this model

Modalities

Input

Text, Image

Output

Text

Pricing

Dedicated Endpoints

View details

Supported Functionality

Model APIs

Dedicated Endpoints

Container

More information

Model card

Explore FriendliAI today

Get started Talk to an engineer

Model Details

Model type: failure predictor / video difficulty estimator for no-reference VQA
Checkpoint type: PEFT / LoRA adapter
Backbone family: Qwen2.5-VL / VisualQuality-R1-style VLM
Base model: hollow404/VQR1-7B-YouTubeUGC
LoRA rank: 64
LoRA alpha: 128
LoRA dropout: 0.05
Training data: YouTube-UGC with base-model prediction scores/errors
Input: a video plus a failure-prediction prompt
Output: a scalar difficulty score on a 1 to 5 scale, typically inside <answer>...</answer> tags
License: Apache 2.0

Intended Use

This checkpoint is intended for research on model-informed data selection for video quality assessment. Typical uses include:

estimating which target-domain videos are difficult for the base VQA model;
ranking an unlabeled video pool by predicted failure/difficulty;
providing the difficulty term in the MDS-VQA greedy selection pipeline;
selecting samples for labeling before active fine-tuning.

Prompt Format

The failure predictor follows the prompt used by src/inference.py in MDS-VQA:

text
You are doing the video quality assessment task. Here is the question:
Assess how difficult it is to evaluate this video's quality for video quality assessment. The difficulty rating should be a float between 1 and 5, rounded to two decimal places, with 1 representing very easy to evaluate and 5 representing very difficult to evaluate.
Please only output the final answer with only one score in <answer> </answer> tags.

For automatic evaluation or selection, parse the scalar value inside the final tag.

The output is a JSON file mapping video names to predicted difficulty scores, for example:

json
{
  "example.mp4": {
    "reasoning": "N/A",
    "score": 3.0
  }
}

MDS-VQA Context

MDS-VQA selects target-domain videos using two complementary signals:

Citation

If you use this model, please cite MDS-VQA:

bibtex
@article{zou2026mds,
  title={MDS-VQA: Model-Informed Data Selection for Video Quality Assessment},
  author={Zou, Jian and Xu, Xiaoyu and Wang, Zhihua and Wang, Yilin and Adsumilli, Balu and Ma, Kede},
  journal={arXiv preprint arXiv:2603.11525},
  year={2026}
}

MDS-VQA-Failure-Predictor

Get help setting up a custom Dedicated Endpoints.

README

Model Details

Intended Use

Prompt Format

MDS-VQA Context

Citation

Explore FriendliAI today

README

Model Details

Intended Use

Prompt Format

MDS-VQA Context

Citation