README

License: apache-2.0

qwen35-4b-iconclass-grpo-v5full

GRPO reward-ablation checkpoint (Qwen3.5-4B-VL, from davanstrien/qwen35-4b-iconclass-vlm). Part of an experiment testing whether a richer reward bundle beats plain hierarchical-F1 (gt_match) for iconclass classification.
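For readers unfamiliar with hierarchical F1, the sketch below shows one common way to compute it: each code is expanded into itself plus its ancestors, and precision and recall are taken over the expanded sets. This is an illustration only, not the experiment's gt_match implementation, and it does not reproduce the "completeness-corrected" variant reported below. The function names are hypothetical, and treating every character prefix as an ancestor is a simplifying assumption that ignores Iconclass features such as bracketed text and keys.

```python
def ancestors(code: str) -> set[str]:
    """Return the code plus every non-empty prefix of it.

    Assumption: each character prefix is treated as a parent notation.
    Real Iconclass notations (bracketed text, keys) need a proper parser.
    """
    return {code[:i] for i in range(1, len(code) + 1)}


def expand(codes: list[str]) -> set[str]:
    """Union of the ancestor sets of all codes in the list."""
    expanded: set[str] = set()
    for code in codes:
        expanded |= ancestors(code)
    return expanded


def hierarchical_f1(predicted: list[str], gold: list[str]) -> float:
    """Hierarchical F1 between predicted and gold code lists (sketch)."""
    pred_set = expand(predicted)
    gold_set = expand(gold)
    overlap = len(pred_set & gold_set)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_set)
    recall = overlap / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# A prediction that stops two levels short of the gold code gets partial credit.
partial = hierarchical_f1(["25F"], ["25F23"])
exact = hierarchical_f1(["25F23"], ["25F23"])
```

Under this definition a prediction in the right branch but at the wrong depth still scores above zero, which is the motivation for using a hierarchical metric rather than exact match.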

  • Reward config: recall + validity + count + diversity
  • Result (completeness-corrected H-F1, 40-image test): 61.8%
  • Verdict: no improvement over plain gt_match (all variants 61–64%, within n=40 noise). Reward tuning is not the lever — the model is capability-bound. The approach that worked is anchored fusion (see qwen35-4b-iconclass-sft-brillfull).
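To give a rough sense of scale for the n=40 noise mentioned above, the sketch below computes a worst-case standard error for a mean of 40 per-image scores that each lie in [0, 1]. It relies on the general bound that such a score with mean p has variance at most p(1 - p). The card does not report the per-image spread, so the true standard error may be smaller. The function name is hypothetical.

```python
import math


def max_standard_error(mean_score: float, n: int) -> float:
    """Upper bound on the standard error of the mean of n scores in [0, 1].

    Uses variance <= mean * (1 - mean), which holds for any variable
    bounded in [0, 1]. Assumes the n scores are independent.
    """
    return math.sqrt(mean_score * (1.0 - mean_score) / n)


# Worst-case standard error for the reported 61.8% on 40 test images.
worst_case_se = max_standard_error(0.618, 40)
```

The bound comes out at about 0.077, or roughly 7.7 percentage points. That is a ceiling on the noise rather than an estimate of it, but it shows why a 61–64% spread is hard to separate on a 40-image test set.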

Base: davanstrien/qwen35-4b-iconclass-vlm. Trained with Unsloth + TRL.

Model provider

davanstrien

Model tree

Base

davanstrien/qwen35-4b-iconclass-vlm

Fine-tuned

this model

Modalities

Input

Video, Text, Image

Output

Text


Supported Functionality

Model APIs

Dedicated Endpoints

Container
