hoin1218

receipt-qwen25vl-3b-lora

Deploy Dedicated

Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more

Container

Run this model inference with full control and performance in your environment.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

Base Model

Base: Qwen/Qwen2.5-VL-3B-Instruct
Adapter type: LoRA / PEFT
Training method: SFT with TRL

Training Data

The training set was built from public receipt key-information-extraction datasets.

Table with columns: Dataset, Used Split / Count, License, Notes
Dataset	Used Split / Count	License	Notes
`jsdnrs/ICDAR2019-SROIE`	987 training examples	CC-BY-4.0	Receipt KIE fields such as company, date, address, total
`nanonets/key_information_extraction`	691 training examples, 296 validation examples	Apache-2.0	Receipt KIE fields such as seller, date, receipt number, tax, amount

Dataset build summary:

Table with columns: Split, Count
Split	Count
Train	1,678
Validation	296
Total prepared examples	1,974

The training target schema was normalized to:

json
{
  "schema_version": "receipt_extraction.v1",
  "document_type": "CARD_RECEIPT|GENERAL_RECEIPT|TAX_INVOICE|INVOICE|UNKNOWN",
  "merchant_name": "string|null",
  "business_no": "string|null",
  "transaction_date": "YYYY-MM-DD|null",
  "transaction_time": "HH:MM:SS|null",
  "approval_no": "string|null",
  "receipt_no": "string|null",
  "card_no_masked": "string|null",
  "supply_amount": "number|null",
  "vat_amount": "number|null",
  "total_amount": "number|null",
  "currency": "string|null",
  "address": "string|null",
  "phone_number": "string|null",
  "items": []
}

Training Run

Table with columns: Item, Value
Item	Value
Train samples	1,678
Eval samples	296
Epochs	1
Optimizer steps	210
Effective batch size	8
Runtime	11,838.942 sec, about 3h 17m 19s
Train loss	7.028
Eval loss	6.165

Preliminary Evaluation

Evaluation was run on the first 20 validation examples from nanonets/key_information_extraction.

The metric is strict field-level exact match after light scalar normalization. This is intentionally harsh for addresses and formatting-heavy fields. It is still useful for seeing whether the adapter is actually improving the fields required by ERP evidence processing.

Table with columns: Model, Valid JSON Rate, All-field Exact Match, Critical-field Exact Match, Runtime, Sec / Image
Model	Valid JSON Rate	All-field Exact Match	Critical-field Exact Match	Runtime	Sec / Image
Qwen2.5-VL-3B base	95.0%	58.9%	70.6%	453.0 sec	22.7
This LoRA adapter	100.0%	65.7%	40.0%	340.3 sec	17.0

Critical fields used for the comparison:

text
merchant_name, business_no, transaction_date, receipt_no,
vat_amount, total_amount, address, phone_number

Field-Level Exact Match

Table with columns: Field, Base, LoRA
Field	Base	LoRA
document_type	0%	100%
merchant_name	85%	80%
business_no	10%	20%
transaction_date	85%	5%
transaction_time	5%	100%
approval_no	80%	100%

Interpretation

This adapter improved strict JSON compliance and schema discipline, but the preliminary test shows that it is not yet better than the base model on several critical receipt fields.

Observed behavior:

Better: strict JSON output, schema consistency, total amount extraction.
Worse than base in this preliminary sample: transaction date, receipt number, VAT amount, phone number.
Main likely cause: public bootstrap labels are sparse and not aligned with Korean ERP receipt requirements. Many important ERP fields are missing or inconsistent across source datasets.

Recommended Next Steps

For practical Korean ERP receipt extraction, use this model as a bootstrap only.

Recommended improvement path:

Label 300-1,000 real Korean business receipt images with the target JSON schema.
Include common receipt types: corporate card receipts, delivery receipts, fuel, meals, lodging, transport, online payment receipts.
Add PaddleOCR-VL or another OCR pass as auxiliary input for fields that are hard to read visually.
Re-train with stronger LoRA coverage, such as more target modules and a higher rank.
Evaluate on a blind Korean receipt set with field-level metrics for merchant_name, business_no, transaction_date, approval_no, supply_amount, vat_amount, and total_amount.

Loading

This repository is a PEFT LoRA adapter. Load it together with the base model:

python
from peft import PeftModel
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

base_model = "Qwen/Qwen2.5-VL-3B-Instruct"
adapter_model = "hoin1218/receipt-qwen25vl-3b-lora"

model = Qwen2_5_VLForConditionalGeneration.from_pretrained(base_model, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_model)
processor = AutoProcessor.from_pretrained(adapter_model)

Limitations

Not validated for production Korean receipts yet.
Public training data is mostly non-Korean.
Exact extraction of business numbers, approval numbers, VAT, and date fields requires in-domain labels.
The model can still omit visible values or produce wrong values. Use rule-based validation and human review for accounting workflows.

Model provider

hoin1218

Model tree

Base

Qwen/Qwen2.5-VL-3B-Instruct

Adapter

this model

Modalities

Input

Text, Image

Output

Text

Pricing

Dedicated Endpoints

View details

Supported Functionality

Model APIs

Dedicated Endpoints

Container

More information

Model card

Explore FriendliAI today

Get started Talk to an engineer

Base Model

Base: Qwen/Qwen2.5-VL-3B-Instruct
Adapter type: LoRA / PEFT
Training method: SFT with TRL

Training Data

The training set was built from public receipt key-information-extraction datasets.

Table with columns: Dataset, Used Split / Count, License, Notes
Dataset	Used Split / Count	License	Notes
`jsdnrs/ICDAR2019-SROIE`	987 training examples	CC-BY-4.0	Receipt KIE fields such as company, date, address, total
`nanonets/key_information_extraction`	691 training examples, 296 validation examples	Apache-2.0	Receipt KIE fields such as seller, date, receipt number, tax, amount

Dataset build summary:

Table with columns: Split, Count
Split	Count
Train	1,678
Validation	296
Total prepared examples	1,974

The training target schema was normalized to:

json
{
  "schema_version": "receipt_extraction.v1",
  "document_type": "CARD_RECEIPT|GENERAL_RECEIPT|TAX_INVOICE|INVOICE|UNKNOWN",
  "merchant_name": "string|null",
  "business_no": "string|null",
  "transaction_date": "YYYY-MM-DD|null",
  "transaction_time": "HH:MM:SS|null",
  "approval_no": "string|null",
  "receipt_no": "string|null",
  "card_no_masked": "string|null",
  "supply_amount": "number|null",
  "vat_amount": "number|null",
  "total_amount": "number|null",
  "currency": "string|null",
  "address": "string|null",
  "phone_number": "string|null",
  "items": []
}

Training Run

Table with columns: Item, Value
Item	Value
Train samples	1,678
Eval samples	296
Epochs	1
Optimizer steps	210
Effective batch size	8
Runtime	11,838.942 sec, about 3h 17m 19s
Train loss	7.028
Eval loss	6.165

Preliminary Evaluation

Evaluation was run on the first 20 validation examples from nanonets/key_information_extraction.

Table with columns: Model, Valid JSON Rate, All-field Exact Match, Critical-field Exact Match, Runtime, Sec / Image
Model	Valid JSON Rate	All-field Exact Match	Critical-field Exact Match	Runtime	Sec / Image
Qwen2.5-VL-3B base	95.0%	58.9%	70.6%	453.0 sec	22.7
This LoRA adapter	100.0%	65.7%	40.0%	340.3 sec	17.0

Critical fields used for the comparison:

text
merchant_name, business_no, transaction_date, receipt_no,
vat_amount, total_amount, address, phone_number

Field-Level Exact Match

Table with columns: Field, Base, LoRA
Field	Base	LoRA
document_type	0%	100%
merchant_name	85%	80%
business_no	10%	20%
transaction_date	85%	5%
transaction_time	5%	100%
approval_no	80%	100%

Interpretation

This adapter improved strict JSON compliance and schema discipline, but the preliminary test shows that it is not yet better than the base model on several critical receipt fields.

Observed behavior:

Better: strict JSON output, schema consistency, total amount extraction.
Worse than base in this preliminary sample: transaction date, receipt number, VAT amount, phone number.
Main likely cause: public bootstrap labels are sparse and not aligned with Korean ERP receipt requirements. Many important ERP fields are missing or inconsistent across source datasets.

Recommended Next Steps

For practical Korean ERP receipt extraction, use this model as a bootstrap only.

Recommended improvement path:

Label 300-1,000 real Korean business receipt images with the target JSON schema.
Include common receipt types: corporate card receipts, delivery receipts, fuel, meals, lodging, transport, online payment receipts.
Add PaddleOCR-VL or another OCR pass as auxiliary input for fields that are hard to read visually.
Re-train with stronger LoRA coverage, such as more target modules and a higher rank.
Evaluate on a blind Korean receipt set with field-level metrics for merchant_name, business_no, transaction_date, approval_no, supply_amount, vat_amount, and total_amount.

Loading

This repository is a PEFT LoRA adapter. Load it together with the base model:

python
from peft import PeftModel
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

base_model = "Qwen/Qwen2.5-VL-3B-Instruct"
adapter_model = "hoin1218/receipt-qwen25vl-3b-lora"

model = Qwen2_5_VLForConditionalGeneration.from_pretrained(base_model, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_model)
processor = AutoProcessor.from_pretrained(adapter_model)

Limitations

Not validated for production Korean receipts yet.
Public training data is mostly non-Korean.
Exact extraction of business numbers, approval numbers, VAT, and date fields requires in-domain labels.
The model can still omit visible values or produce wrong values. Use rule-based validation and human review for accounting workflows.

receipt-qwen25vl-3b-lora

Get help setting up a custom Dedicated Endpoints.

README

Base Model

Training Data

Training Run

Preliminary Evaluation

Field-Level Exact Match

Interpretation

Recommended Next Steps

Loading

Limitations

Explore FriendliAI today

README

Base Model

Training Data

Training Run

Preliminary Evaluation

Field-Level Exact Match

Interpretation

Recommended Next Steps

Loading

Limitations