Laborator/ai-numismatist-qwen3vl-2b-coins-lora

License: apache-2.0

What it is

A LoRA adapter for Qwen/Qwen3-VL-2B-Instruct, fine-tuned to identify and describe coins. The base model's weights are unchanged; the adapter is a small set of trained low-rank weights that you load on top of it with PEFT.
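As a rough illustration of what "low-rank weights on an unchanged base" means, here is a toy sketch in plain Python. It assumes the standard LoRA formulation, y = W x + (alpha / r) * B (A x), with B starting at zero. The 3x3 matrix, the rank of 1, and every value below are made up for the example and have nothing to do with the real adapter weights.

```python
# Toy LoRA layer: the frozen weight W is never modified; only A and B are trained.

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def lora_forward(W, A, B, x, alpha, r):
    """Return W x + (alpha / r) * B (A x)."""
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * d for b, d in zip(base, delta)]

# Hypothetical frozen base weight (d_out=3, d_in=3).
W = [[1.0, 0.0, 0.0],
     [0.0, 2.0, 0.0],
     [0.0, 0.0, 3.0]]
A = [[0.5, 0.5, 0.5]]            # shape r x d_in, with r = 1
B_zero = [[0.0], [0.0], [0.0]]   # shape d_out x r, zero before training
B_trained = [[1.0], [0.0], [-1.0]]
x = [1.0, 1.0, 1.0]

r = len(A)
alpha = r  # alpha / r = 1, the same ratio as this card's alpha=16, r=16

y_base = matvec(W, x)
y_init = lora_forward(W, A, B_zero, x, alpha, r)        # identical to the base output
y_adapted = lora_forward(W, A, B_trained, x, alpha, r)  # base output plus low-rank update

print(y_base, y_init, y_adapted)
```

With B at zero the adapted layer reproduces the base layer exactly, which is why removing the adapter gives back the original model.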

Training

Method: QLoRA (4-bit nf4, double-quant)
LoRA: r=16, alpha=16, dropout=0.05
Target modules: attention (q_proj, k_proj, v_proj, o_proj) and MLP (up_proj, gate_proj, down_proj) projections
Epochs: 3
Hardware: single NVIDIA RTX 3090
Trainable params: ~17.4M (~0.81% of the base model)
Loss: trended from ~7 to ~5 over training
Training set: 1707 CC0 public-domain coin images, all struck on/before 1925
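
The trainable-parameter figure can be sanity-checked with a little arithmetic. The sketch below assumes the text backbone has hidden size 2048, MLP size 6144, 28 layers, 16 query heads, 8 key/value heads, and head dimension 128. None of these dimensions are stated in this card, so treat them as assumptions to verify against the base model's config. Each adapted linear layer of shape d_in x d_out adds r * (d_in + d_out) LoRA parameters.

```python
# Assumed text-backbone dimensions (NOT stated in this card; check the base config).
HIDDEN = 2048
INTERMEDIATE = 6144
NUM_LAYERS = 28
NUM_HEADS = 16
NUM_KV_HEADS = 8
HEAD_DIM = 128
RANK = 16  # r from the Training section

def lora_params(d_in, d_out, r=RANK):
    """Parameters in one LoRA pair: A is r x d_in, B is d_out x r."""
    return r * d_in + d_out * r

q_out = NUM_HEADS * HEAD_DIM      # 2048
kv_out = NUM_KV_HEADS * HEAD_DIM  # 1024

per_layer = {
    "q_proj": lora_params(HIDDEN, q_out),
    "k_proj": lora_params(HIDDEN, kv_out),
    "v_proj": lora_params(HIDDEN, kv_out),
    "o_proj": lora_params(q_out, HIDDEN),
    "gate_proj": lora_params(HIDDEN, INTERMEDIATE),
    "up_proj": lora_params(HIDDEN, INTERMEDIATE),
    "down_proj": lora_params(INTERMEDIATE, HIDDEN),
}

total = NUM_LAYERS * sum(per_layer.values())
print(f"{total:,} trainable LoRA parameters (~{total / 1e6:.1f}M)")
```

Under these assumptions the count comes to 17,432,576, in line with the ~17.4M reported above. That would be consistent with the adapter touching only the language-model layers, although the card does not say so explicitly.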

Data sources

Source	Coins	License
The Metropolitan Museum of Art	1188	CC0 1.0
Cleveland Museum of Art	519	CC0 1.0
Total	1707

Text reference: Nomisma.org core concepts (nomisma.org/id/*), CC BY 3.0.

Every image is CC0 and shows a coin struck in or before 1925. The date cutoff is intended to keep the set clear of any copyright that modern national mints may hold in their coin designs.
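
The two inclusion rules (CC0 license, struck in or before 1925) could be applied with a filter along these lines. This is a minimal sketch, not the project's actual data pipeline: the `CoinRecord` class, its field names, and the sample records are all hypothetical.

```python
from dataclasses import dataclass

CUTOFF_YEAR = 1925  # from the card: coins struck on/before 1925

@dataclass
class CoinRecord:
    """Hypothetical metadata record; field names are illustrative only."""
    source: str
    license: str
    latest_year_struck: int  # negative values stand for BCE dates

def is_eligible(rec: CoinRecord, cutoff: int = CUTOFF_YEAR) -> bool:
    """Keep a record only if it is CC0 and struck on or before the cutoff year."""
    return rec.license == "CC0 1.0" and rec.latest_year_struck <= cutoff

# Made-up sample records to exercise the filter.
records = [
    CoinRecord("The Metropolitan Museum of Art", "CC0 1.0", -440),
    CoinRecord("Cleveland Museum of Art", "CC0 1.0", 1925),
    CoinRecord("Cleveland Museum of Art", "CC0 1.0", 1964),   # too recent
    CoinRecord("Some other archive", "CC BY 4.0", 1800),      # not CC0
]

eligible = [rec for rec in records if is_eligible(rec)]
print(len(eligible), "of", len(records), "records kept")
```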

Intended use

Educational coin identification and description. Useful for collectors, students, archaeologists, and anyone trying to put a name to a coin in a drawer or a find tray.

Not a numismatic appraisal. This adapter does not estimate market value, condition grade, or authenticity. For valuation, consult a qualified numismatist.

Limitations

Exact dates and ruler attributions are approximate. The model is most dependable on coin type, culture, and broad era; do not rely on it for narrow-window dating without expert review.
This is a proof of concept trained on 1707 reference coins, not a finished catalogue. Coins that look similar to the reference set but fall outside it are likely to be misidentified.
Coverage skews to the cultures present in the Met and Cleveland CC0 sets: Oriental, Greek, Medieval, Byzantine, Roman, Modern (on/before 1925), Indian, Persian, East Asian.
Inherits the limitations of the base Qwen/Qwen3-VL-2B-Instruct model.

How to use

```python
import torch
from transformers import AutoProcessor, Qwen3VLForConditionalGeneration, BitsAndBytesConfig
from peft import PeftModel
from PIL import Image

BASE = "Qwen/Qwen3-VL-2B-Instruct"
ADAPTER = "Laborator/ai-numismatist-qwen3vl-2b-coins-lora"

# Load the base model in 4-bit NF4 with double quantization, matching the training setup.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

processor = AutoProcessor.from_pretrained(BASE)
model = Qwen3VLForConditionalGeneration.from_pretrained(
    BASE, quantization_config=bnb, device_map="auto", torch_dtype=torch.bfloat16,
)
# Attach the LoRA adapter on top of the quantized base model.
model = PeftModel.from_pretrained(model, ADAPTER)
model.eval()

# Build a single-turn chat message with one coin image and a text prompt.
image = Image.open("coin.jpg").convert("RGB")
messages = [{"role": "user", "content": [
    {"type": "image", "image": image},
    {"type": "text", "text": "Identify this coin. State its type, culture, and approximate date."},
]}]

# Apply the chat template, tokenize, and move the tensors to the model's device.
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)

# Greedy decoding; slice off the prompt tokens so only the answer is decoded.
out = model.generate(**inputs, max_new_tokens=200, do_sample=False)
print(processor.batch_decode(out[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)[0])
```
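
If you query many coins, it may be convenient to build the `messages` structure with a small helper. The sketch below is plain Python and mirrors the message layout used above. The `build_messages` name is hypothetical, and passing more than one image per turn (for example obverse and reverse) is an assumption: the card does not say whether the adapter was trained on multi-image prompts.

```python
DEFAULT_PROMPT = "Identify this coin. State its type, culture, and approximate date."

def build_messages(images, prompt=DEFAULT_PROMPT):
    """Return a single-turn chat message list with the images first, then the text prompt.

    `images` is a list of whatever objects the processor accepts as an image
    (PIL images in the example above).
    """
    if not images:
        raise ValueError("at least one image is required")
    content = [{"type": "image", "image": img} for img in images]
    content.append({"type": "text", "text": prompt})
    return [{"role": "user", "content": content}]

# Placeholder object standing in for a PIL image, just to show the resulting structure.
placeholder = object()
messages = build_messages([placeholder])
print(len(messages[0]["content"]), "content parts")
```

The returned list can be passed to `processor.apply_chat_template` in place of the hand-written `messages` literal.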

License

This adapter is released under Apache 2.0, the same license as the base model.

Full source, training script, and inference pipeline: https://github.com/SergheiBrinza/ai-numismatist

Credits

Base model: Qwen/Qwen3-VL-2B-Instruct by the Qwen Team / Alibaba — Apache 2.0
Recognition model in the pipeline: Coin-CLIP by Breezedeus — Apache 2.0
Reference images: The Metropolitan Museum of Art and the Cleveland Museum of Art — CC0 1.0
Numismatic concept graph: Nomisma.org — CC BY 3.0
