README

Model details

  • Developed by: Paula Guerrero and Iker Gutierrez
  • Affiliation: University of the Basque Country (EHU)
  • Model type: LoRA adapter for HiTZ/Latxa-Qwen3-VL-8B-Instruct
  • Languages: Catalan (ca), Basque (eu)
  • Domain: Literary translation
  • Base model: HiTZ/Latxa-Qwen3-VL-8B-Instruct
  • Repository: pguerrero-igutierrez/Latxa-Qwen3-8B-Literary-v1-ca-eu
  • Collection: pguerrero-igutierrez/mt-domain-adaptation-ca-eu

Intended use

This model is intended for literary translation research in the Catalan-Basque pair, especially when no direct in-domain parallel corpus is available.

Supported prompting directions:

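  • eu->ca: Itzuli testu hau euskaratik katalanera:\n\n{source} (Basque for "Translate this text from Basque into Catalan:")
  • ca->eu: Tradueix aquest text del català al basc:\n\n{source} (Catalan for "Translate this text from Catalan into Basque:")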
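
The two templates above can be filled programmatically. A minimal sketch, assuming the literal \n\n in the templates stands for two newline characters; the names PROMPT_TEMPLATES and build_prompt are illustrative helpers and not part of the released code:

```python
# Prompt templates copied from the model card. The dictionary and function
# names are hypothetical helpers, not part of the adapter's codebase.
PROMPT_TEMPLATES = {
    "eu->ca": "Itzuli testu hau euskaratik katalanera:\n\n{source}",
    "ca->eu": "Tradueix aquest text del català al basc:\n\n{source}",
}


def build_prompt(direction: str, source: str) -> str:
    """Return the prompt for one supported direction, or raise ValueError."""
    if direction not in PROMPT_TEMPLATES:
        raise ValueError(f"unsupported direction: {direction!r}")
    # str.replace rather than str.format, so braces in the source text are kept.
    return PROMPT_TEMPLATES[direction].replace("{source}", source.strip())
```

Rejecting unknown directions early avoids silently prompting the adapter in a format it was not trained on.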

Out-of-scope use

  • Publication for human readers without literary post-editing
  • Translation outside the literary register
  • High-stakes or professional workflows without review

Training data

The adapter was trained on two synthetic literary corpora:

  • backtranslated-corpus/ca-literary_trilingual.json
  • backtranslated-corpus/eu-literary-EhuHac.jsonl

The EU->CA direction uses synthetic Basque as source and original Catalan as target. The CA->EU direction uses synthetic Catalan as source and original Basque as target.
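
The pairing described above can be sketched as follows. The record layout (original_lang, original, backtranslation) is an assumption made for illustration; the actual field names in the two corpus files are not documented here.

```python
from typing import Dict

# Hypothetical record layout: each back-translated record holds an original
# literary segment and its machine-generated translation into the other language.
OTHER = {"ca": "eu", "eu": "ca"}


def to_training_pair(record: Dict[str, str]) -> Dict[str, str]:
    """Put the synthetic text on the source side and the original on the target side."""
    target_lang = record["original_lang"]  # human-written side
    source_lang = OTHER[target_lang]       # synthetic (back-translated) side
    return {
        "direction": f"{source_lang}->{target_lang}",
        "source": record["backtranslation"],
        "target": record["original"],
    }
```

With this arrangement the model is always trained to produce human-written text, which is the usual motivation for back-translation.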

Training procedure

  • LoRA rank: 16
  • LoRA alpha: 32
  • LoRA dropout: 0.05
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Quantization: 4-bit NF4
  • Max sequence length: 768
  • Epochs: 3
  • Batch size: 4
  • Gradient accumulation: 8
  • Learning rate: 5e-5
  • Scheduler: cosine
  • Warmup ratio: 0.05
  • Seed: 42
  • Checkpoint selection: best validation BLEU
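
Two quantities follow from the hyperparameters above, and the schedule shape can be sketched in a few lines. This is an illustration under assumptions, not the project's training script: it assumes a single device, the standard LoRA scaling of alpha / rank, and linear warmup followed by cosine decay to zero. The total_steps argument is hypothetical because the corpus size is not stated.

```python
import math

RANK, ALPHA = 16, 32
BATCH_SIZE, GRAD_ACCUM = 4, 8
PEAK_LR, WARMUP_RATIO = 5e-5, 0.05

# Sequences contributing to each optimizer step (per device).
effective_batch_size = BATCH_SIZE * GRAD_ACCUM  # 32
# Factor applied to the low-rank update in standard LoRA.
lora_scaling = ALPHA / RANK  # 2.0


def learning_rate(step: int, total_steps: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay to zero."""
    warmup_steps = int(WARMUP_RATIO * total_steps)
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For a hypothetical run of 1000 optimizer steps, the rate would climb during the first 50 steps and decay for the remaining 950.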

Evaluation

Results on the literary held-out test set:

Direction  chrF++  BLEU   TER    COMET
eu->ca     36.13   8.96   85.08  69.61
ca->eu     26.90   2.44   99.60  65.29
Overall    31.34   6.12   91.91  66.37

In the project experiments, this direct literary SFT model slightly but consistently outperformed the continued-adaptation literary variant.

Limitations

  • Uses synthetic supervision rather than human-translated in-domain CA-EU literary parallel data
  • Literary quality is only partially reflected by overlap-based metrics
  • CA->EU remains the harder literary direction in the reported experiments

Usage

python

import torch
from peft import PeftModel
from transformers import AutoTokenizer, Qwen3VLForConditionalGeneration

base_id = "HiTZ/Latxa-Qwen3-VL-8B-Instruct"
adapter_id = "pguerrero-igutierrez/Latxa-Qwen3-8B-Literary-v1-ca-eu"

# Load the tokenizer and the base model, then attach the LoRA adapter.
tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base_model = Qwen3VLForConditionalGeneration.from_pretrained(
    base_id,
    device_map="auto",
    # bfloat16 on GPU, float32 when falling back to CPU.
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base_model, adapter_id)

# Translate one Catalan sentence into Basque using the ca->eu prompt.
prompt = "Tradueix aquest text del català al basc:\n\nLa nit era tranquil·la."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
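
For decoder-only models, generate returns the prompt tokens followed by the continuation, so the decoded string printed above would be expected to start with the prompt. A small post-processing sketch that keeps only the translation; extract_translation is a hypothetical helper, and it assumes the tokenizer round-trips the prompt unchanged (otherwise it returns the text as-is):

```python
def extract_translation(decoded: str, prompt: str) -> str:
    """Drop the echoed prompt from a decoded generation, if present."""
    if decoded.startswith(prompt):
        return decoded[len(prompt):].strip()
    # Fallback: the prompt was not reproduced verbatim, so leave the text alone.
    return decoded.strip()
```

Slicing the generated token IDs at the prompt length before decoding achieves the same result without relying on string matching.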

Citation

bibtex

@misc{guerrero-gutierrez-2026-caeu-mt,
  title  = {Domain Adaptation for Catalan-Basque Machine Translation via Synthetic Data and Continued Fine-Tuning},
  author = {Guerrero, Paula and Gutierrez, Iker},
  year   = {2026},
  note   = {Unpublished manuscript}
}

Contact

  • Paula Guerrero: pguerrero005@ikasle.ehu.eus
  • Iker Gutierrez: igutierrez134@ikasle.ehu.eus

Model provider

pguerrero-igutierrez

Model tree

Base

HiTZ/Latxa-Qwen3-VL-8B-Instruct

Adapter

this model

Modalities

Input

Text, Image

Output

Text
