Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more
Container

Run this model inference with full control and performance in your environment.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

README

License: apache-2.0

Table of Contents

Model Description

PropertyValue
Base Modelgplsi/Aitana-2B-SI-Instruct
ArchitectureTransformer decoder-only
Parameters~2.25B
LanguagesValencian, Spanish, English
LicenseApache 2.0

Aitana-2B-SI-Instruct-Aligned extends the Aitana-2B-SI-Instruct instruction-tuned model with Direct Preference Optimization (DPO) alignment. This additional training stage improves the model's ability to generate helpful, high-quality responses that better align with human preferences while maintaining its strong multilingual capabilities.

Alignment Details

The model was aligned using Direct Preference Optimization (DPO) with the following configuration:

HyperparameterValue
MethodDPO (Direct Preference Optimization)
Learning rate5e-6
Epochs1
Beta0.1
LR SchedulerLinear
Total Samples146,180
English Samples80,308
Spanish Samples30,072
Valencian Samples35,800
LanguagesSpanish, Valencian, English

The DPO alignment was performed using curated preference pairs that teach the model to prefer more helpful, accurate, and well-structured responses.

Training Data

The base instruction model was trained on the ALIA Instruction/v12 dataset. This DPO-aligned variant was further aligned using the Alignment/v8 dataset, composed of the following preference data:

Dataset IDNameLanguagesSource
al1HelpSteer3EN, ESnvidia/HelpSteer3
al2OpenAssistant1 (OASST1)EN, ES, RU (+32 more)OpenAssistant/oasst1
al3OpenAssistant2 (OASST2)EN, ES, RU (+32 more)OpenAssistant/oasst2
al4OpenOrcaENOpen-Orca/OpenOrca
al5OASST2 ValencianoVA

The alignment data focused on English, Spanish, and Valencian preference pairs, with the distribution: 80,308 English, 30,072 Spanish, and 35,800 Valencian samples.

Intended Uses

This model can be used for:

  • Instruction following in Valencian, Spanish, and English with improved alignment to human preferences
  • Chat and conversational applications requiring high-quality multilingual responses
  • Text generation with task-specific prompting and improved output quality
  • Domain-specific applications in administrative, legal, or tourism contexts

Note: As an aligned instruction-tuned model, it is designed to follow user prompts and generate helpful, safe responses. It is not intended for use as a factual knowledge base. The DPO alignment improves response quality and preference alignment.

How to Use

Transformers

python

import torch
from transformers import pipeline, AutoTokenizer
model_id = "gplsi/Aitana-2B-SI-Instruct-Aligned"
tokenizer = AutoTokenizer.from_pretrained(model_id)
generator = pipeline(
"text-generation",
model=model_id,
tokenizer=tokenizer,
torch_dtype=torch.bfloat16,
device_map="auto",
)
# Valencian example
text = "Explica què són les Corts Valencianes i quina funció tenen."
result = generator(text, do_sample=True, top_k=10, max_new_tokens=100)
print(result[0]['generated_text'])
# Spanish example
text = "Describe las principales funciones del gobierno autonómico valenciano."
result = generator(text, do_sample=True, top_k=10, max_new_tokens=100)
print(result[0]['generated_text'])
# English example
text = "Explain the role of tourism in the Valencian Community economy."
result = generator(text, do_sample=True, top_k=10, max_new_tokens=100)
print(result[0]['generated_text'])

Evaluation

In the following tables, we present the results obtained with different benchmarks from lm-evaluation-harness in comparison with Salamandra-2B-Instruct, and Aitana-2B-S-Instruct-Aligned. The results reflect the DPO-aligned instruction-tuned performance.

Valencian

Classification Benchmarks

DatasetLang.TaskMetricSalamandra-2B-InstructAitana-2B-S-Instruct-Aligned (v0.1)Aitana-2B-SI-Instruct-Aligned (v0.1)
XNLIvaNatural Language Inferenceacc0.5200.5140.485

Generation Benchmarks

DatasetLang.TaskMetricSalamandra-2B-InstructAitana-2B-S-Instruct-Aligned (v0.1)Aitana-2B-SI-Instruct-Aligned (v0.1)
CocoterosvaReading Comprehensionbleu2.7963.6124.223
Phrases ca-vava-caTranslation - Adaptationbleu58.42574.53868.305
Phrases va-cava-caTranslation - Adaptationbleu70.66071.69169.551
Phrases va-esva-esTranslationbleu65.42772.09770.061
Phrases es-vaes-vaTranslationbleu45.68856.01254.053
Truthfulqa_vavaTruthfulnessbleu_acc0.4090.3940.383

Catalan

Classification Benchmarks

DatasetLang.TaskMetricSalamandra-2B-InstructAitana-2B-S-Instruct-Aligned (v0.1)Aitana-2B-SI-Instruct-Aligned (v0.1)
Belebele Cat_latncaReading Comprehensionacc0.2870.2480.319
COPAcaCommonsense Reasoningacc0.7080.7260.694
XStoryClozecaCommonsense Reasoningacc0.6160.6290.623
OpenBookQAcaQuestion Answeringacc0.2960.2960.326
PAWScaParaphrasingacc0.6020.5980.531
PiQAcaQuestion Answeringacc0.6380.6550.629
ARC EasycaQuestion Answeringacc0.5160.5240.526
ARC ChallengecaQuestion Answeringacc0.2980.3140.310
XNLIcaNatural Language Inferenceacc0.5130.5150.497
TecacaNatural Language Inferenceacc0.4860.5000.468
WNLIcaNatural Language Inferenceacc0.5630.4370.436
CatcolacaLinguistic Acceptabilityacc0.4920.7130.680
CatcolacaLinguistic Acceptabilitymcc0.097-0.0400.013
CatalanqacaQuestion AnsweringF10.5160.3840.396
Mgsm directcaMathexact match0.0000.0120.004
CatalanqacaQuestion Answeringexact match0.1820.0110.031
XquadcaQuestion Answeringexact match0.1030.0140.037
XquadcaQuestion AnsweringF10.3940.2870.317

Generation Benchmarks

DatasetLang.TaskMetricSalamandra-2B-InstructAitana-2B-S-Instruct-Aligned (v0.1)Aitana-2B-SI-Instruct-Aligned (v0.1)
Cabreu abstractivecaSummarizationbleu7.6107.7038.837
Cabreu extractivecaSummarizationbleu38.00219.87628.16803
Cabreu extremecaSummarizationbleu2.7333.2453.386

Spanish

Classification Benchmarks

DatasetLang.TaskMetricSalamandra-2B-InstructAitana-2B-S-Instruct-Aligned (v0.1)Aitana-2B-SI-Instruct-Aligned (v0.1)
BelebeleesReading Comprehensionacc0.2680.2440.285
PAWSesParaphrasingacc0.5660.6180.546
XNLIesNatural Language Inferenceacc0.4630.4390.443
WNLIesNatural Language Inferenceacc0.4790.5350.535
XStoryClozeesCommonsense Reasoningacc0.6170.6280.632
EscolaesLinguistic Acceptabilityacc0.2930.7080.654
EscolaesLinguistic Acceptabilitymcc0.0200.0000.046
OpenbookQAesQuestion Answeringacc0.2860.3380.332
MGSM DirectesMathexact match0.0200.0240.1
XQUADesQuestion Answeringexact match0.0660.0260.019
XQUADesQuestion AnsweringF10.3550.2930.293

Generation Benchmarks

DatasetLang.TaskMetricSalamandra-2B-InstructAitana-2B-S-Instruct-Aligned (v0.1)Aitana-2B-SI-Instruct-Aligned (v0.1)
CocoterosesReading Comprehensionbleu3.3083.1413.670
XLSumesSummarizationbleu1.6951.7371.971

English

Classification Benchmarks

DatasetLang.TaskMetricSalamandra-2B-InstructAitana-2B-S-Instruct-Aligned (v0.1)Aitana-2B-SI-Instruct-Aligned (v0.1)
Arc ChallengeenQuestion Answeringacc0.3540.3630.372
Arc EasyenQuestion Answeringacc0.6810.7090.682
BelebeleenReading Comprehensionacc0.2600.2930.349
PAWSenParaphrasingacc0.5970.5940.555
XNLIenNatural Language Inferenceacc0.5120.5530.480
XStoryClozeenCommonsense Reasoningacc0.6620.6800.693
OpenBookQAenQuestion Answeringacc0.2980.3380.316
PiQAenQuestion Answeringacc0.7150.7170.704
Social iqaenQuestion Answeringacc0.4530.4510.468
WNLIenNatural Language Inferenceacc0.5350.4650.451
MGSM DirectenMathexact match0.0080.0520.116
TriviaQAenQuestion Answeringexact match0.0760.1470.156

Judge Evaluation

The model was also evaluated using an LLM-as-judge approach across different task categories. The scores below represent the average rating (1-5 scale, 5 being best) and standard deviation for each task category, comparing Aitana-2B-SI-Instruct-Aligned against Salamandra-2B-Instruct and Aitana-2B-S-Instruct-Aligned.

Task CategorySalamandra-2B-InstructAitana-2B-S-Instruct-Aligned (v0.1)Aitana-2B-SI-Instruct-Aligned (v0.1)
CommonSense reasoning2.277 / 1.1512.737 / 1.1402.969 / 1.086
Maths1.060 / 0.1241.123 / 0.2491.191 / 0.349
Paraphrasing3.518 / 1.3083.460 / 1.0883.472 / 0.959
Reading comprehension2.966 / 1.1112.894 / 1.3113.112 / 1.146
Summarization2.217 / 1.0682.261 / 0.8202.591 / 1.115
Translation3.557 / 0.7603.418 / 0.9993.390 / 0.730
Overall Avg2.599 / 0.9202.649 / 0.9352.787 / 0.897

The DPO-aligned model shows a notable improvement in overall average score (2.787) compared to Aitana-2B-S-Instruct-Aligned (v0.1) (2.649) and Salamandra-2B-Instruct (2.599) with particular gains in CommonSense reasoning, reading comprehension and summarization. The aligned model also shows tighter standard deviations in several categories, indicating more consistent quality responses.

Additional Information

Author

The model has been developed by the Language and Information Systems Group (GPLSI) and the Centro de Inteligencia Digital (CENID), both part of the University of Alicante (UA), as part of their ongoing research in Natural Language Processing (NLP).

Funding

This work is funded by the Ministerio para la Transformación Digital y de la Función Pública, co-financed by the EU – NextGenerationEU, within the framework of the project Desarrollo de Modelos ALIA. This work has also been partially supported by Project HEART-NLP (PID2024-156263OB-C22).

Acknowledgments

We would like to express our gratitude to all individuals and institutions that have contributed to the development of this work. Special thanks to:

We also acknowledge the financial, technical, and scientific support of the Ministerio para la Transformación Digital y de la Función Pública - Funded by EU – NextGenerationEU within the framework of the project Desarrollo de Modelos ALIA, whose contribution has been essential to the completion of this research.

License

Apache License, Version 2.0

Disclaimer

This model is intended for general purposes and is available under a permissive Apache License 2.0. Be aware that the model may have biases and/or undesirable outputs. Users deploying systems based on this model are responsible for mitigating risks and complying with applicable AI regulations.

Reference

bibtex

@misc{gplsi-aitana-2B-SI-Instruct-Aligned,
author = {Galiano, Santiago and Sepúlveda-Torres, Robiert and Martínez-Murillo, Iván and Grande, Eduardo and Consuegra-Ayala, Juan Pablo and Miró Maestre, María and Canal-Esteve, Miquel and Bonora, Mar and Gutierrez, Yoan and Abreu Salas, José Ignacio and Lloret, Elena and Montoyo, Andrés and Muñoz-Guillena, Rafael and Palomar, Manuel},
title = {Aitana 2B SI Instruct Aligned: DPO-aligned instruction-tuned model for Valencian, Spanish and English},
year = {2026},
institution = {Language and Information Systems Group (GPLSI) and Centro de Inteligencia Digital (CENID), University of Alicante (UA)},
howpublished = {\url{https://huggingface.co/gplsi/Aitana-2B-SI-Instruct-Aligned}},
note = {Accessed: 2026-05-11}
}

Copyright © 2026 Language and Information Systems Group (GPLSI) and Centro de Inteligencia Digital (CENID), University of Alicante (UA). Distributed under the Apache License 2.0.

Model provider

gplsi

gplsi

Model tree

Base

gplsi/Aitana-2B-SI-Instruct

Fine-tuned

this model

Modalities

Input

Text

Output

Text

Pricing

Dedicated Endpoints

View details

Supported Functionality

Model APIs

Dedicated Endpoints

Container

More information

Explore FriendliAI today