Base Model
Qwen/Qwen2.5-VL-3B-Instruct
Training Data
The training set was built from public receipt key-information-extraction datasets.
| Dataset | Used Split / Count | License | Notes |
|---|---|---|---|
| jsdnrs/ICDAR2019-SROIE | 987 training examples | CC-BY-4.0 | Receipt KIE fields such as company, date, address, total |
| nanonets/key_information_extraction | 691 training examples, 296 validation examples | Apache-2.0 | Receipt KIE fields such as seller, date, receipt number, tax, amount |
Dataset build summary:
| Split | Count |
|---|---|
| Train | 1,678 |
| Validation | 296 |
| Total prepared examples | 1,974 |
The training target schema was normalized to:
{
  "schema_version": "receipt_extraction.v1",
  "document_type": "CARD_RECEIPT|GENERAL_RECEIPT|TAX_INVOICE|INVOICE|UNKNOWN",
  "merchant_name": "string|null",
  "business_no": "string|null",
  "transaction_date": "YYYY-MM-DD|null",
  "transaction_time": "HH:MM:SS|null",
  "approval_no": "string|null",
  "receipt_no": "string|null",
  "card_no_masked": "string|null",
  "supply_amount": "number|null",
  "vat_amount": "number|null",
  "total_amount": "number|null",
  "currency": "string|null",
  "address": "string|null",
  "phone_number": "string|null",
  "items": []
}
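A parsed model output can be checked against this shape before it is scored or passed downstream. The following is a minimal sketch, not the validation used to build the dataset: the function name `check_schema` and the specific checks are assumptions for illustration, and only the field names and allowed `document_type` values come from the schema above.

```python
import re

# Field names and document types copied from the receipt_extraction.v1 schema.
EXPECTED_KEYS = {
    "schema_version", "document_type", "merchant_name", "business_no",
    "transaction_date", "transaction_time", "approval_no", "receipt_no",
    "card_no_masked", "supply_amount", "vat_amount", "total_amount",
    "currency", "address", "phone_number", "items",
}
DOCUMENT_TYPES = {"CARD_RECEIPT", "GENERAL_RECEIPT", "TAX_INVOICE", "INVOICE", "UNKNOWN"}
NUMBER_FIELDS = ("supply_amount", "vat_amount", "total_amount")


def check_schema(record: dict) -> list:
    """Hypothetical helper: return a list of problems; an empty list means the shape is fine."""
    problems = []

    missing = sorted(EXPECTED_KEYS - set(record))
    extra = sorted(set(record) - EXPECTED_KEYS)
    if missing:
        problems.append(f"missing keys: {missing}")
    if extra:
        problems.append(f"unexpected keys: {extra}")

    if record.get("document_type") not in DOCUMENT_TYPES:
        problems.append("document_type is not one of the allowed values")

    # Date and time may be null; otherwise they must follow the schema's formats.
    date = record.get("transaction_date")
    if date is not None and not re.fullmatch(r"\d{4}-\d{2}-\d{2}", str(date)):
        problems.append("transaction_date is not YYYY-MM-DD")
    time = record.get("transaction_time")
    if time is not None and not re.fullmatch(r"\d{2}:\d{2}:\d{2}", str(time)):
        problems.append("transaction_time is not HH:MM:SS")

    # Amounts may be null; otherwise they must be real numbers, not strings or booleans.
    for name in NUMBER_FIELDS:
        value = record.get(name)
        if value is not None and (isinstance(value, bool) or not isinstance(value, (int, float))):
            problems.append(f"{name} is not a number")

    if "items" in record and not isinstance(record["items"], list):
        problems.append("items is not a list")

    return problems
```

Calling `check_schema(parsed_output)` on a well-formed record returns an empty list; each violated rule adds one message, which makes it easy to log why an output was rejected.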
Training Run
| Item | Value |
|---|---|
| Train samples | 1,678 |
| Eval samples | 296 |
| Epochs | 1 |
| Optimizer steps | 210 |
| Effective batch size | 8 |
| Runtime | 11,838.942 sec, about 3h 17m 19s |
| Train loss | 7.028 |
| Eval loss | 6.165 |
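The step count and runtime in this table can be cross-checked with a few lines of arithmetic. This sketch assumes the last partial batch is kept rather than dropped, and that the reported runtime covers the whole run; neither detail is stated above.

```python
import math

train_samples = 1678
effective_batch_size = 8
epochs = 1

# Assumption: the final partial batch still counts as one optimizer step.
optimizer_steps = math.ceil(train_samples / effective_batch_size) * epochs

runtime_sec = 11838.942
hours, remainder = divmod(runtime_sec, 3600)
minutes, seconds = divmod(remainder, 60)

# Average wall-clock time per optimizer step, including any overhead in the runtime.
sec_per_step = runtime_sec / optimizer_steps

print(optimizer_steps)
print(f"{int(hours)}h {int(minutes)}m {round(seconds)}s")
print(f"{sec_per_step:.1f} sec per step")
```

Under these assumptions the result matches the 210 optimizer steps and the 3h 17m 19s runtime reported in the table.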
Preliminary Evaluation
Evaluation was run on the first 20 validation examples from nanonets/key_information_extraction.
The metric is strict field-level exact match after light scalar normalization. This is intentionally harsh for addresses and formatting-heavy fields. It is still useful for seeing whether the adapter is actually improving the fields required by ERP evidence processing.
| Model | Valid JSON Rate | All-field Exact Match | Critical-field Exact Match | Runtime | Sec / Image |
|---|---|---|---|---|---|
| Qwen2.5-VL-3B base | 95.0% | 58.9% | 70.6% | 453.0 sec | 22.7 |
| This LoRA adapter | 100.0% | 65.7% | 40.0% | 340.3 sec | 17.0 |
Critical fields used for the comparison:
merchant_name, business_no, transaction_date, receipt_no,
vat_amount, total_amount, address, phone_number
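The metric described above can be sketched as follows. This is an illustration under stated assumptions, not the evaluation script behind the reported numbers: the normalization rules (whitespace collapsing, comparing numbers as floats) and the function names are hypothetical, and the aggregate is assumed to pool all field/example pairs equally.

```python
CRITICAL_FIELDS = [
    "merchant_name", "business_no", "transaction_date", "receipt_no",
    "vat_amount", "total_amount", "address", "phone_number",
]


def normalize(value):
    """Assumed 'light scalar normalization': collapse whitespace, compare numbers as floats."""
    if value is None or isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return float(value)
    text = " ".join(str(value).split())
    return text or None  # treat an empty string like a missing value


def field_exact_match(predictions, references, fields):
    """Per-field share of examples whose normalized prediction equals the reference."""
    if not references or len(predictions) != len(references):
        raise ValueError("predictions and references must be non-empty and aligned")
    scores = {}
    for field in fields:
        hits = sum(
            normalize(pred.get(field)) == normalize(ref.get(field))
            for pred, ref in zip(predictions, references)
        )
        scores[field] = hits / len(references)
    return scores


def pooled_exact_match(predictions, references, fields):
    """Aggregate score; with equal counts per field this equals the mean of per-field scores."""
    scores = field_exact_match(predictions, references, fields)
    return sum(scores.values()) / len(fields)
```

With only 20 evaluation examples, a single example moves any per-field score by 5 percentage points, so small differences between the two models should be read with caution.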
Field-Level Exact Match
| Field | Base | LoRA |
|---|---|---|
| document_type | 0% | 100% |
| merchant_name | 85% | 80% |
| business_no | 10% | 20% |
| transaction_date | 85% | 5% |
| transaction_time | 5% | 100% |
| approval_no | 80% | 100% |
Interpretation
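This adapter improved strict JSON compliance and schema discipline (valid JSON rate 95.0% to 100.0%, all-field exact match 58.9% to 65.7%), but critical-field exact match fell from 70.6% to 40.0% in this 20-example test, so it is not yet better than the base model on several critical receipt fields.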
Observed behavior:
- Better: strict JSON output, schema consistency, total amount extraction.
- Worse than base in this preliminary sample: transaction date, receipt number, VAT amount, phone number.
- Most likely cause: the public bootstrap labels are sparse and not aligned with Korean ERP receipt requirements. Many important ERP fields are missing or inconsistent across the source datasets.
Recommended Next Steps
For practical Korean ERP receipt extraction, use this model as a bootstrap only.
Recommended improvement path:
- Label 300-1,000 real Korean business receipt images with the target JSON schema.
- Include common receipt types: corporate card receipts, delivery receipts, fuel, meals, lodging, transport, online payment receipts.
- Add PaddleOCR-VL or another OCR pass as auxiliary input for fields that are hard to read visually.
- Re-train with stronger LoRA coverage, such as more target modules and a higher rank.
- Evaluate on a blind Korean receipt set with field-level metrics for merchant_name, business_no, transaction_date, approval_no, supply_amount, vat_amount, and total_amount.
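As an illustration of "stronger LoRA coverage", a re-training run might use settings like the ones below. All values are hypothetical: the rank and target modules of the released adapter are not stated in this card, and the module names are an assumption based on the usual projection names in Qwen2-style decoder layers. Names such as `gate_proj` may also match layers in the vision tower, so the matched modules should be inspected before training.

```python
# Hypothetical hyperparameters for a re-training run; not the settings of this adapter.
lora_kwargs = {
    "r": 32,                # higher rank than a minimal adapter
    "lora_alpha": 64,       # scaling factor, here 2 x rank
    "lora_dropout": 0.05,
    "target_modules": [
        # attention projections
        "q_proj", "k_proj", "v_proj", "o_proj",
        # MLP projections, added for broader coverage
        "gate_proj", "up_proj", "down_proj",
    ],
    "task_type": "CAUSAL_LM",
}

try:
    from peft import LoraConfig

    lora_config = LoraConfig(**lora_kwargs)
except ImportError:
    # peft is not installed; the dictionary above still documents the intended settings.
    lora_config = None
```

Keeping the settings in a plain dictionary makes it easy to log them next to the evaluation results, so that rank and module changes can be compared across runs.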
Loading
This repository is a PEFT LoRA adapter. Load it together with the base model:
from peft import PeftModel
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

base_model = "Qwen/Qwen2.5-VL-3B-Instruct"
adapter_model = "hoin1218/receipt-qwen25vl-3b-lora"

# Load the base weights first, then attach the LoRA adapter on top.
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(base_model, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_model)
# The processor (tokenizer and image preprocessing) is loaded from the adapter repository.
processor = AutoProcessor.from_pretrained(adapter_model)
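After generation, the decoded text still has to be turned into a dictionary, and the base model in particular may wrap its answer in extra text or a markdown fence. The helper below is a minimal sketch of tolerant parsing; the name `parse_model_json` is hypothetical and this is not necessarily how the valid JSON rate above was measured.

```python
import json


def parse_model_json(text: str):
    """Return the first JSON object found in the generated text, or None if none parses."""
    start = text.find("{")
    if start == -1:
        return None
    try:
        # raw_decode parses one JSON value and ignores any trailing text after it.
        obj, _end = json.JSONDecoder().raw_decode(text[start:])
    except json.JSONDecodeError:
        return None
    return obj if isinstance(obj, dict) else None
```

An output counts as valid under this sketch when the function returns a dictionary; a `None` result can be logged and sent to human review instead of being silently dropped.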
Limitations
- Not yet validated on production Korean receipts.
- Public training data is mostly non-Korean.
- Exact extraction of business numbers, approval numbers, VAT, and date fields requires in-domain labels.
- The model can still omit visible values or produce wrong values. Use rule-based validation and human review for accounting workflows.
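The rule-based validation mentioned above could start with a few cheap consistency checks that route suspicious records to a human reviewer. This is a sketch under assumptions: the function name `review_flags`, the rounding tolerance, and the rule that a Korean business registration number has 10 digits are choices made for illustration, not requirements stated in this card.

```python
import re
from datetime import date


def review_flags(record: dict, tolerance: float = 1.0) -> list:
    """Hypothetical helper: return reasons why a record should go to human review."""
    flags = []

    amounts = [record.get(k) for k in ("supply_amount", "vat_amount", "total_amount")]
    if all(isinstance(a, (int, float)) and not isinstance(a, bool) for a in amounts):
        supply, vat, total = amounts
        # Assumption: supply + VAT should equal the total, up to a small rounding tolerance.
        if abs(supply + vat - total) > tolerance:
            flags.append("supply_amount + vat_amount does not match total_amount")
    if record.get("total_amount") is None:
        flags.append("total_amount is missing")

    business_no = record.get("business_no")
    # Assumption: business registration numbers have 10 digits once separators are removed.
    if business_no is not None and len(re.sub(r"\D", "", str(business_no))) != 10:
        flags.append("business_no does not have 10 digits")

    raw_date = record.get("transaction_date")
    if raw_date is not None:
        try:
            date.fromisoformat(raw_date)  # rejects impossible dates such as 2024-02-30
        except (TypeError, ValueError):
            flags.append("transaction_date is not a valid calendar date")

    return flags
```

Records with an empty flag list can proceed automatically, while any flagged record is held for review; the rules are deliberately conservative and would need tuning on real receipts.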