Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more
Container

Run this model inference with full control and performance in your environment.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

README

License: apache-2.0

Table of Contents

Model Description

PropertyValue
Base Modelgplsi/Aitana-2B-S-Instruct
ArchitectureTransformer decoder-only
Parameters~2.25B
LanguagesValencian, Spanish, English
LicenseApache 2.0

Aitana-2B-S-Instruct-Aligned extends the Aitana-2B-S-Instruct instruction-tuned model with Direct Preference Optimization (DPO) alignment. This additional training stage improves the model's ability to generate helpful, high-quality responses that better align with human preferences while maintaining its strong multilingual capabilities.

Alignment Details

The model was aligned using Direct Preference Optimization (DPO) with the following configuration:

HyperparameterValue
MethodDPO (Direct Preference Optimization)
Learning rate5e-6
Epochs1
Beta0.1
LR SchedulerLinear
Total Samples146,180
English Samples80,308
Spanish Samples30,072
Valencian Samples35,800
LanguagesSpanish, Valencian, English

The DPO alignment was performed using curated preference pairs that teach the model to prefer more helpful, accurate, and well-structured responses.

Training Data

The base instruction model was trained on the ALIA Instruction/v12 dataset. This DPO-aligned variant was further aligned using the Alignment/v8 dataset, composed of the following preference data:

Dataset IDNameLanguagesSource
al1HelpSteer3EN, ESnvidia/HelpSteer3
al2OpenAssistant1 (OASST1)EN, ES, RU (+32 more)OpenAssistant/oasst1
al3OpenAssistant2 (OASST2)EN, ES, RU (+32 more)OpenAssistant/oasst2
al4OpenOrcaENOpen-Orca/OpenOrca
al5OASST2 ValencianoVA

The alignment data focused on English, Spanish, and Valencian preference pairs, with the distribution: 80,308 English, 30,072 Spanish, and 35,800 Valencian samples.

Intended Uses

This model can be used for:

  • Instruction following in Valencian, Spanish, and English with improved alignment to human preferences
  • Chat and conversational applications requiring high-quality multilingual responses
  • Text generation with task-specific prompting and improved output quality
  • Domain-specific applications in administrative, legal, or tourism contexts

Note: As an aligned instruction-tuned model, it is designed to follow user prompts and generate helpful, safe responses. It is not intended for use as a factual knowledge base. The DPO alignment improves response quality and preference alignment.

How to Use

Transformers

python

import torch
from transformers import pipeline, AutoTokenizer
model_id = "gplsi/Aitana-2B-S-Instruct-Aligned"
tokenizer = AutoTokenizer.from_pretrained(model_id)
generator = pipeline(
"text-generation",
model=model_id,
tokenizer=tokenizer,
torch_dtype=torch.bfloat16,
device_map="auto",
)
# Valencian example
text = "Explica què són les Corts Valencianes i quina funció tenen."
result = generator(text, do_sample=True, top_k=10, max_new_tokens=100)
print(result[0]['generated_text'])
# Spanish example
text = "Describe las principales funciones del gobierno autonómico valenciano."
result = generator(text, do_sample=True, top_k=10, max_new_tokens=100)
print(result[0]['generated_text'])
# English example
text = "Explain the role of tourism in the Valencian Community economy."
result = generator(text, do_sample=True, top_k=10, max_new_tokens=100)
print(result[0]['generated_text'])

Evaluation

In the following tables, we present the results obtained with different benchmarks from lm-evaluation-harness in comparison with Salamandra-2B-Instruct. The results reflect the DPO-aligned instruction-tuned performance.

Valencian

Classification Benchmarks

DatasetLang.TaskMetricSalamandra-2B-InstructAitana-2B-S-Instruct-Aligned (v0.1)
XNLIvaNatural Language Inferenceacc0.5200.514

Generation Benchmarks

DatasetLang.TaskMetricSalamandra-2B-InstructAitana-2B-S-Instruct-Aligned (v0.1)
CocoterosvaReading Comprehensionbleu2.7963.612
Phrases ca-vava-caTranslation - Adaptationbleu58.42574.538
Phrases va-cava-caTranslation - Adaptationbleu70.66071.691
Phrases va-esva-esTranslationbleu65.42772.097
Phrases es-vaes-vaTranslationbleu45.68856.012
Truthfulqa_vavaTruthfulnessbleu_acc0.4090.394

Catalan

Classification Benchmarks

DatasetLang.TaskMetricSalamandra-2B-InstructAitana-2B-S-Instruct-Aligned (v0.1)
Belebele Cat_latncaReading Comprehensionacc0.2870.248
COPAcaCommonsense Reasoningacc0.7080.726
XStoryClozecaCommonsense Reasoningacc0.6160.629
OpenBookQAcaQuestion Answeringacc0.2960.296
PAWScaParaphrasingacc0.6020.598
PiQAcaQuestion Answeringacc0.6380.655
ARC EasycaQuestion Answeringacc0.5160.524
ARC ChallengecaQuestion Answeringacc0.2980.314
XNLIcaNatural Language Inferenceacc0.5130.515
TecacaNatural Language Inferenceacc0.4860.500
WNLIcaNatural Language Inferenceacc0.5630.437
CatcolacaLinguistic Acceptabilityacc0.4920.713
CatcolacaLinguistic Acceptabilitymcc0.097-0.040
CatalanqacaQuestion AnsweringF10.5160.384
Mgsm directcaMathexact match0.0000.012
CatalanqacaQuestion Answeringexact match0.1820.011
XquadcaQuestion Answeringexact match0.1030.014
XquadcaQuestion AnsweringF10.3940.287

Generation Benchmarks

DatasetLang.TaskMetricSalamandra-2B-InstructAitana-2B-S-Instruct-Aligned (v0.1)
Cabreu abstractivecaSummarizationbleu7.6107.703
Cabreu extractivecaSummarizationbleu38.00219.876
Cabreu extremecaSummarizationbleu2.7333.245

Spanish

Classification Benchmarks

DatasetLang.TaskMetricSalamandra-2B-InstructAitana-2B-S-Instruct-Aligned (v0.1)
BelebeleesReading Comprehensionacc0.2680.244
PAWSesParaphrasingacc0.5660.618
XNLIesNatural Language Inferenceacc0.4630.439
WNLIesNatural Language Inferenceacc0.4790.535
XStoryClozeesCommonsense Reasoningacc0.6170.628
EscolaesLinguistic Acceptabilityacc0.2930.708
EscolaesLinguistic Acceptabilitymcc0.0200.000
OpenbookQAesQuestion Answeringacc0.2860.338
MGSM DirectesMathexact match0.0200.024
XQUADesQuestion Answeringexact match0.0660.026
XQUADesQuestion AnsweringF10.3550.293

Generation Benchmarks

DatasetLang.TaskMetricSalamandra-2B-InstructAitana-2B-S-Instruct-Aligned (v0.1)
CocoterosesReading Comprehensionbleu3.3083.141
XLSumesSummarizationbleu1.6951.737

English

Classification Benchmarks

DatasetLang.TaskMetricSalamandra-2B-InstructAitana-2B-S-Instruct-Aligned (v0.1)
Arc ChallengeenQuestion Answeringacc0.3540.363
Arc EasyenQuestion Answeringacc0.6810.709
BelebeleenReading Comprehensionacc0.2600.293
PAWSenParaphrasingacc0.5970.594
XNLIenNatural Language Inferenceacc0.5120.553
XStoryClozeenCommonsense Reasoningacc0.6620.680
OpenBookQAenQuestion Answeringacc0.2980.338
PiQAenQuestion Answeringacc0.7150.717
Social iqaenQuestion Answeringacc0.4530.451
WNLIenNatural Language Inferenceacc0.5350.465
MGSM DirectenMathexact match0.0080.052
TriviaQAenQuestion Answeringexact match0.0760.147

Judge Evaluation

The model was also evaluated using an LLM-as-judge approach across different task categories. The scores below represent the average rating (1-5 scale, 5 being best) and standard deviation for each task category, comparing Aitana-2B-S-Instruct-Aligned-v0.1 against Salamandra-2B-Instruct.

Task CategorySalamandra-2B-InstructAitana-2B-S-Instruct-Aligned (v0.1)
CommonSense reasoning2.277 / 1.1512.737 / 1.140
Maths1.060 / 0.1241.123 / 0.249
Paraphrasing3.518 / 1.3083.460 / 1.088
Reading comprehension2.966 / 1.1112.894 / 1.311
Summarization2.217 / 1.0682.261 / 0.820
Translation3.557 / 0.7603.418 / 0.999
Overall Avg2.599 / 0.9202.649 / 0.935

The DPO-aligned model shows a notable improvement in overall average score (2.649 vs 2.599) compared to Salamandra-2B-Instruct, with particular gains in CommonSense reasoning and Maths. The aligned model also shows tighter standard deviations in several categories, indicating more consistent quality responses.

Additional Information

Author

The model has been developed by the Language and Information Systems Group (GPLSI) and the Centro de Inteligencia Digital (CENID), both part of the University of Alicante (UA), as part of their ongoing research in Natural Language Processing (NLP).

Funding

This work is funded by the Ministerio para la Transformación Digital y de la Función Pública, co-financed by the EU – NextGenerationEU, within the framework of the project Desarrollo de Modelos ALIA. This work has also been partially supported by Project HEART-NLP (PID2024-156263OB-C22).

Acknowledgments

We would like to express our gratitude to all individuals and institutions that have contributed to the development of this work. Special thanks to:

We also acknowledge the financial, technical, and scientific support of the Ministerio para la Transformación Digital y de la Función Pública - Funded by EU – NextGenerationEU within the framework of the project Desarrollo de Modelos ALIA, whose contribution has been essential to the completion of this research.

License

Apache License, Version 2.0

Disclaimer

This model is intended for general purposes and is available under a permissive Apache License 2.0. Be aware that the model may have biases and/or undesirable outputs. Users deploying systems based on this model are responsible for mitigating risks and complying with applicable AI regulations.

Reference

bibtex

@misc{gplsi-aitana-2B-S-Instruct-Aligned,
author = {Galiano, Santiago and Sepúlveda-Torres, Robiert and Martínez-Murillo, Iván and Grande, Eduardo and Consuegra-Ayala, Juan Pablo and Miró Maestre, María and Canal-Esteve, Miquel and Bonora, Mar and Gutierrez, Yoan and Abreu Salas, José Ignacio and Lloret, Elena and Montoyo, Andrés and Muñoz-Guillena, Rafael and Palomar, Manuel},
title = {Aitana 2B Instruct Aligned: DPO-aligned instruction-tuned model for Valencian, Spanish and English},
year = {2026},
institution = {Language and Information Systems Group (GPLSI) and Centro de Inteligencia Digital (CENID), University of Alicante (UA)},
howpublished = {\url{https://huggingface.co/gplsi/Aitana-2B-S-Instruct-Aligned}},
note = {Accessed: 2026-05-11}
}

Copyright © 2026 Language and Information Systems Group (GPLSI) and Centro de Inteligencia Digital (CENID), University of Alicante (UA). Distributed under the Apache License 2.0.

Model provider

gplsi

gplsi

Model tree

Base

gplsi/Aitana-2B-S-Instruct

Fine-tuned

this model

Modalities

Input

Text

Output

Text

Pricing

Dedicated Endpoints

View details

Supported Functionality

Model APIs

Dedicated Endpoints

Container

More information

Explore FriendliAI today