Laborator

ai-numismatist-qwen3vl-2b-coins-lora


README

License: apache-2.0

What it is

A LoRA adapter on top of Qwen/Qwen3-VL-2B-Instruct. The base model is unchanged; this is a small set of trained low-rank weights you load via PEFT.

Training

  • Method: QLoRA (4-bit nf4, double-quant)
  • LoRA: r=16, alpha=16, dropout=0.05
  • Target modules: attention (q_proj, k_proj, v_proj, o_proj) and MLP (up_proj, gate_proj, down_proj) projections
  • Epochs: 3
  • Hardware: single NVIDIA RTX 3090
  • Trainable params: ~17.4M (~0.81% of the base model)
  • Loss: trended from ~7 to ~5 over training
  • Training set: 1707 CC0 (public-domain) images of coins, all struck in or before 1925
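As a rough cross-check of the trainable-parameter figure above, the sketch below counts LoRA weights by hand. A LoRA pair on a linear layer with shape (in, out) adds r * (in + out) parameters. The decoder dimensions used here (hidden size, MLP width, layer count, head counts) are assumptions not stated in this README, as is the assumption that only the language-decoder layers match the listed target module names. The dictionary keys follow `peft.LoraConfig` argument names, but this is an illustration, not the author's training script.

```python
# LoRA hyperparameters as listed in the Training section. The keys mirror
# peft.LoraConfig argument names, so the dict could be passed to it.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj",
                       "up_proj", "gate_proj", "down_proj"],
}

# ASSUMPTION: these decoder dimensions are not given in the README.
HIDDEN = 2048        # hidden size of the language decoder
INTERMEDIATE = 6144  # MLP inner width
NUM_LAYERS = 28      # number of decoder layers
Q_DIM = 16 * 128     # attention heads * head_dim
KV_DIM = 8 * 128     # key/value heads * head_dim

# (in_features, out_features) for each targeted projection.
PROJ_SHAPES = {
    "q_proj": (HIDDEN, Q_DIM),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (Q_DIM, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_param_count(rank, shapes, modules, num_layers):
    """Count LoRA weights: each adapted layer adds A (rank x in) and B (out x rank)."""
    per_layer = sum(rank * (shapes[m][0] + shapes[m][1]) for m in modules)
    return per_layer * num_layers


total = lora_param_count(
    LORA_KWARGS["r"], PROJ_SHAPES, LORA_KWARGS["target_modules"], NUM_LAYERS
)
# With standard (non-rsLoRA) scaling, the low-rank update is multiplied by alpha / r.
scaling = LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"]

print(f"{total:,} trainable LoRA parameters, update scaling {scaling}")
```

Under these assumed dimensions the count comes to about 17.4M, in line with the figure reported in the list above.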

Data sources

Source                            Coins   License
The Metropolitan Museum of Art    1188    CC0 1.0
Cleveland Museum of Art           519     CC0 1.0
Total                             1707

Text reference: Nomisma.org core concepts (nomisma.org/id/*), CC BY 3.0.

Every image is CC0 (public domain), and every coin shown was struck in or before 1925, which clears modern national-mint design copyright worldwide.

Intended use

Educational coin identification and description. Useful for collectors, students, archaeologists, and anyone trying to put a name to a coin in a drawer or a find tray.

Not a numismatic appraisal. This adapter does not estimate market value, condition grade, or authenticity. For valuation, consult a qualified numismatist.

Limitations

  • Exact dates and ruler attributions are approximate. The model is reliable on coin type, culture, and broad era; it should not be trusted for narrow-window dating without expert review.
  • This is a proof of concept on 1707 reference coins, not a finished catalogue. Coins that look similar to the references but fall outside the reference distribution are likely to be misidentified.
  • Coverage skews to the cultures present in the Met and Cleveland CC0 sets: Oriental, Greek, Medieval, Byzantine, Roman, Modern (on/before 1925), Indian, Persian, East Asian.
  • Inherits the limitations of the base Qwen/Qwen3-VL-2B-Instruct model.

How to use

python

import torch
from PIL import Image
from peft import PeftModel
from transformers import AutoProcessor, BitsAndBytesConfig, Qwen3VLForConditionalGeneration

BASE = "Qwen/Qwen3-VL-2B-Instruct"
ADAPTER = "Laborator/ai-numismatist-qwen3vl-2b-coins-lora"

# Load the base model in 4-bit NF4 with double quantization, as in training.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

processor = AutoProcessor.from_pretrained(BASE)
model = Qwen3VLForConditionalGeneration.from_pretrained(
    BASE, quantization_config=bnb, device_map="auto", torch_dtype=torch.bfloat16,
)

# Attach the LoRA adapter on top of the quantized base model.
model = PeftModel.from_pretrained(model, ADAPTER)
model.eval()

# Build a single-turn chat message with one image and one text prompt.
image = Image.open("coin.jpg").convert("RGB")
messages = [{"role": "user", "content": [
    {"type": "image", "image": image},
    {"type": "text", "text": "Identify this coin. State its type, culture, and approximate date."},
]}]

inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)

# Greedy decoding; drop the prompt tokens so only the generated answer is decoded.
out = model.generate(**inputs, max_new_tokens=200, do_sample=False)
print(processor.batch_decode(out[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)[0])
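
When running the snippet above over several images or prompts, it may help to factor the message construction into a small function. The helper below is a hypothetical convenience, not part of the adapter or of any library; `build_coin_messages` and `DEFAULT_QUESTION` are names introduced here for illustration, and the message layout simply mirrors the one used above.

```python
# Default prompt copied from the usage example above.
DEFAULT_QUESTION = "Identify this coin. State its type, culture, and approximate date."


def build_coin_messages(image, question=DEFAULT_QUESTION):
    """Return a single-turn chat message list with one image and one text prompt.

    `image` is passed through unchanged; in the usage example it is a PIL image.
    """
    return [{"role": "user", "content": [
        {"type": "image", "image": image},
        {"type": "text", "text": question},
    ]}]


# Example with a placeholder standing in for a loaded image.
messages = build_coin_messages("coin.jpg", "Describe the obverse of this coin.")
print(messages[0]["content"][1]["text"])
```

The returned list can be passed to `processor.apply_chat_template` in place of the inline `messages` literal.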

License

This adapter is released under Apache 2.0, inheriting the license of the base model.

Full source, training script, and inference pipeline: https://github.com/SergheiBrinza/ai-numismatist

Credits

  • Base model: Qwen/Qwen3-VL-2B-Instruct by the Qwen Team / Alibaba — Apache 2.0
  • Recognition model in the pipeline: Coin-CLIP by Breezedeus — Apache 2.0
  • Reference images: The Metropolitan Museum of Art and the Cleveland Museum of Art — CC0 1.0
  • Numismatic concept graph: Nomisma.org — CC BY 3.0

Model provider

Laborator

Model tree

Base: Qwen/Qwen3-VL-2B-Instruct
Adapter: this model

Modalities

Input: Text, Image
Output: Text
