Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

README

License: apache-2.0

Test accuracy (relaxed, %)

SplitAccuracy
human78.24%
augmented91.60%
avg84.92%

Strategy ranking on Gemma 4 (avg): MD-CO (T=3) 84.92 > MFT 84.80 > DT 82.60 > FT 79.16

Key training detail

  • Strategy: mdco — L = α·CE + (1−α)·(KL_OCR + KL_QA), sequence-level KD
  • Temperature T = 3.0 (vs. default 1.0). Critical for stronger students: a sharp (T=1) teacher distribution gives a capable student like Gemma 4 nothing to learn; softening it lets KD help.
  • α = 0.8, 4-bit QLoRA (nf4 + double quant), LoRA r=16/α=16, seq_mode="mean"
  • Teacher (soft targets only): GPT-4.1

Usage

python

from transformers import AutoProcessor, AutoModelForImageTextToText
from peft import PeftModel
import torch
base_id = "google/gemma-4-E4B-it"
model = AutoModelForImageTextToText.from_pretrained(base_id, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, "UrbanAI-EH/md-co-chartqa-gemma4_mdco_human")
processor = AutoProcessor.from_pretrained(base_id)

Citation

bibtex

@article{go2026mdco,
title={MD-CO: A Knowledge Distillation Framework for Sophisticated Understanding
and Reasoning in Chart Question Answering},
author={Go, Young-Min and Jung, Hae Sun and Uprety, Sudan Prasad and Park, Keon Chul},
journal={International Journal on Document Analysis and Recognition},
year={2026}
}

Model provider

UrbanAI-EH

Model tree

Base

google/gemma-4-E4B-it

Adapter

this model

Modalities

Input

Text, Image

Output

Text

Pricing

Dedicated Endpoints

View details

Supported Functionality

Model APIs

Dedicated Endpoints

Container

More information

Explore FriendliAI today