README

Model details

  • Developed by: Paula Guerrero and Iker Gutierrez
  • Affiliation: University of the Basque Country (EHU)
  • Model type: LoRA adapter for HiTZ/Latxa-Qwen3-VL-8B-Instruct
  • Languages: Catalan (ca), Basque (eu)
  • Domain: Clinical translation
  • Direction: ca->eu only
  • Base model: HiTZ/Latxa-Qwen3-VL-8B-Instruct
  • Repository: pguerrero-igutierrez/Latxa-Qwen3-8B-Clinical-v1-ca-eu
  • Collection: pguerrero-igutierrez/mt-domain-adaptation-ca-eu

Intended use

This model is intended for research on Catalan-to-Basque clinical translation in low-resource settings.

Supported prompting direction:

  • ca->eu: Tradueix aquest text clínic del català al basc:\n\n{source}
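As a minimal sketch of how the template above can be filled in, the helper below inserts a source sentence into the prompt. The names `PROMPT_TEMPLATE` and `build_prompt` are illustrative assumptions and are not part of the released code. Stripping surrounding whitespace is also an assumption about reasonable input handling.

```python
# Prompt template copied from the model card; {source} is the Catalan input.
PROMPT_TEMPLATE = "Tradueix aquest text clínic del català al basc:\n\n{source}"


def build_prompt(source: str) -> str:
    """Return the ca->eu prompt for one Catalan clinical sentence or passage.

    Hypothetical helper: trims surrounding whitespace and rejects empty input.
    """
    text = source.strip()
    if not text:
        raise ValueError("source text must not be empty")
    return PROMPT_TEMPLATE.format(source=text)


if __name__ == "__main__":
    print(build_prompt("El pacient presenta febre alta."))
```

Keeping the template in one constant makes it harder to drift from the prompt wording the adapter is documented to support.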

Out-of-scope use

  • Medical decision-making
  • Clinical deployment without expert review
  • Any reverse direction (eu->ca)
  • Translation outside the clinical domain

Training data

The adapter was trained on backtranslated-corpus/eu-clinical_backtranslated.json, where synthetic Catalan (ca) is used as source and original Basque (eu) as target.

The corpus was built by back-translating Basque clinical documents from the E3C corpus into Catalan.

Training procedure

  • LoRA rank: 16
  • LoRA alpha: 32
  • LoRA dropout: 0.05
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Quantization: 4-bit NF4
  • Max sequence length: 768
  • Epochs: 3
  • Batch size: 4
  • Gradient accumulation: 8
  • Learning rate: 5e-5
  • Scheduler: cosine
  • Warmup ratio: 0.05
  • Seed: 42
  • Checkpoint selection: best validation BLEU
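Two derived quantities follow from the hyperparameters above and can be worked out as a short sketch. It assumes the listed batch size is per device, that training ran on a single device (the card does not state the device count), and that the standard LoRA scaling alpha / rank applies rather than a rank-stabilized variant. The `TrainingSetup` class is hypothetical and only groups the numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSetup:
    """Hyperparameters from the model card, plus one labeled assumption."""

    lora_rank: int = 16
    lora_alpha: int = 32
    batch_size: int = 4                    # assumed to be per device
    gradient_accumulation_steps: int = 8
    num_devices: int = 1                   # assumption: not stated in the card

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA multiplies the low-rank update by alpha / rank.
        return self.lora_alpha / self.lora_rank

    @property
    def effective_batch_size(self) -> int:
        # Sequences contributing to each optimizer step.
        return self.batch_size * self.gradient_accumulation_steps * self.num_devices


if __name__ == "__main__":
    setup = TrainingSetup()
    print(setup.lora_scaling)          # 2.0
    print(setup.effective_batch_size)  # 32
```

Under these assumptions each optimizer step sees 32 sequences; with more devices the effective batch size would scale accordingly.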

Evaluation

Results on the clinical held-out test set:

Direction  chrF++  BLEU   TER     COMET
ca->eu     40.20   19.43  101.09  76.25

This was the strongest model in the project on the clinical domain by chrF++, BLEU, and COMET.
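A TER above 100 may look surprising, but it is possible because TER divides the number of edits by the reference length, so a hypothesis that needs more edits than the reference has words scores over 100. The sketch below illustrates this with a simplified word-level edit rate. It omits the block shifts that real TER allows, so it is only an upper bound on TER and not the metric implementation used in the project. The function names and the toy tokens are illustrative assumptions.

```python
def edit_distance(hyp: list[str], ref: list[str]) -> int:
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # drop the hypothesis word
                            curr[j - 1] + 1,      # add the reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def edit_rate(hypothesis: str, reference: str) -> float:
    """Edits per reference word, times 100 (simplified TER without shifts)."""
    ref = reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(hypothesis.split(), ref) / len(ref)


if __name__ == "__main__":
    # Four edits against a two-word reference give a rate of 200.
    print(edit_rate("c d e f", "a b"))
```

Lower is better for TER, unlike chrF++, BLEU, and COMET, which is consistent with the sentence above ranking the model only by the other three metrics.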

Limitations

  • Only supports ca->eu
  • Trained on synthetic-source data
  • Automatic metrics do not replace expert clinical validation
  • Must not be used for diagnosis or patient care without human oversight

Usage

python

import torch
from peft import PeftModel
from transformers import AutoTokenizer, Qwen3VLForConditionalGeneration

base_id = "HiTZ/Latxa-Qwen3-VL-8B-Instruct"
adapter_id = "pguerrero-igutierrez/Latxa-Qwen3-8B-Clinical-v1-ca-eu"

# Load the tokenizer and base model, then attach the LoRA adapter.
tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base_model = Qwen3VLForConditionalGeneration.from_pretrained(
    base_id,
    device_map="auto",
    # bfloat16 on GPU, float32 on CPU
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base_model, adapter_id)

# Use the ca->eu prompt format listed under "Intended use".
prompt = "Tradueix aquest text clínic del català al basc:\n\nEl pacient presenta febre alta."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens so the prompt is not echoed back.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))

Citation

bibtex

@misc{guerrero-gutierrez-2026-caeu-mt,
  title  = {Domain Adaptation for Catalan-Basque Machine Translation via Synthetic Data and Continued Fine-Tuning},
  author = {Guerrero, Paula and Gutierrez, Iker},
  year   = {2026},
  note   = {Unpublished manuscript}
}

Contact

  • Paula Guerrero: pguerrero005@ikasle.ehu.eus
  • Iker Gutierrez: igutierrez134@ikasle.ehu.eus

Model provider

pguerrero-igutierrez

Model tree

Base

HiTZ/Latxa-Qwen3-VL-8B-Instruct

Adapter

this model

Modalities

Input

Text, Image

Output

Text
