guillemolivart

qwen-2b-absa-qlora


Model Description

The model is specialized in constrained generation for structured information extraction. Given a restaurant review in English or Spanish, it produces strict JSON output that matches a predefined schema of 11 valid aspect categories and 4 valid polarities, so no separate heuristic parser is needed.

Allowed Schema

Aspect Categories: restaurant_general, food_quality, service, ambience, food_style_options, food_prices, restaurant_prices, drinks_quality, drinks_style_options, location, drinks_prices.
Sentiment Polarities: positive, negative, neutral, conflict.
Developed by: Roger Baiges Trilla & Guillem Olivart Garrofé
Language(s): Bilingual (English and Spanish)
License: Apache-2.0
Finetuned from model: Qwen/Qwen3.5-2B
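
The schema above can be checked mechanically on any model response. The following is a minimal sketch, not part of the repository: the helper name `validate_absa` is hypothetical, and it assumes the output is a flat JSON object mapping each aspect category to one polarity, as in the example output shown in the inference section.

```python
import json

# Allowed labels, copied from the schema listed above.
ASPECTS = {
    "restaurant_general", "food_quality", "service", "ambience",
    "food_style_options", "food_prices", "restaurant_prices",
    "drinks_quality", "drinks_style_options", "location", "drinks_prices",
}
POLARITIES = {"positive", "negative", "neutral", "conflict"}


def validate_absa(raw: str) -> dict:
    """Parse a model response and reject anything outside the schema.

    Hypothetical helper: assumes a flat {aspect: polarity} JSON object.
    """
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for aspect, polarity in data.items():
        if aspect not in ASPECTS:
            raise ValueError(f"unknown aspect category: {aspect!r}")
        if polarity not in POLARITIES:
            raise ValueError(f"unknown polarity: {polarity!r}")
    return data


if __name__ == "__main__":
    print(validate_absa('{"food_quality": "positive", "service": "negative"}'))
```

Raising on the first violation keeps the check strict; a more lenient pipeline could instead drop unknown keys and log them.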

Experimental Results & Performance

The model was evaluated iteratively on the devel.json dataset across the project's four mandatory stages. QLoRA fine-tuning proved the most robust approach against dataset imbalance and yielded the highest scores on all macro and micro metrics.
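
To illustrate why macro and micro metrics are reported separately under class imbalance, here is a small sketch of both F1 variants. The project's actual scoring script is not shown in this card, so the details are assumptions: predictions are treated as sets of (aspect, polarity) pairs per review, counts are grouped by aspect category, and the function name `micro_macro_f1` is hypothetical.

```python
from collections import Counter


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(gold, pred):
    """gold and pred are lists of sets of (aspect, polarity) pairs, one set per review."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        for aspect, _ in g & p:   # pair predicted and correct
            tp[aspect] += 1
        for aspect, _ in p - g:   # pair predicted but not in gold
            fp[aspect] += 1
        for aspect, _ in g - p:   # gold pair that was missed
            fn[aspect] += 1
    labels = set(tp) | set(fp) | set(fn)
    # Micro: pool all counts, so frequent aspects dominate the score.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: average per-aspect F1, so rare aspects weigh as much as common ones.
    macro = sum(f1(tp[a], fp[a], fn[a]) for a in labels) / len(labels) if labels else 0.0
    return micro, macro


if __name__ == "__main__":
    gold = [
        {("food_quality", "positive")},
        {("food_quality", "positive")},
        {("food_quality", "negative")},
        {("location", "positive")},
    ]
    pred = [
        {("food_quality", "positive")},
        {("food_quality", "positive")},
        {("food_quality", "negative")},
        set(),  # the rare "location" aspect is missed
    ]
    micro, macro = micro_macro_f1(gold, pred)
    print(f"micro F1 = {micro:.3f}, macro F1 = {macro:.3f}")
```

In this toy example a single missed rare aspect barely moves micro F1 but halves macro F1, which is why a gain in macro F1 is a useful signal that minority categories are handled better.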

Main Development-Set Results

Configuration	Precision (Macro)	F1 (Macro)	F1 (Micro)
Empty baseline	0.0%	0.0%	0.0%
Top 3 majority baseline	60.4%	55.6%	56.2%
Best Zero-Shot Sweep	74.6%	67.9%	68.3%
Best Few-Shot	79.6%	73.9%	73.9%

Key Takeaway: QLoRA fine-tuning raised the small language model (SLM) to 85.3% macro F1, a gain of 17.4 percentage points over the best zero-shot configuration (67.9%). This suggests that 4-bit quantization did not degrade the model's extraction capabilities.
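
The gain quoted above is an absolute difference in percentage points rather than a relative increase. The short sketch below makes the distinction explicit using only figures from this card (67.9% and 73.9% from the table, 85.3% from the Training Prompt Template section); the function names are illustrative.

```python
def gain_points(new: float, old: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return round(new - old, 1)


def relative_gain_percent(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`, in percent."""
    return round(100 * (new - old) / old, 1)


ZERO_SHOT_F1_MACRO = 67.9  # Best Zero-Shot Sweep
FEW_SHOT_F1_MACRO = 73.9   # Best Few-Shot
QLORA_F1_MACRO = 85.3      # QLoRA fine-tuned model

if __name__ == "__main__":
    print(gain_points(QLORA_F1_MACRO, ZERO_SHOT_F1_MACRO))            # 17.4
    print(relative_gain_percent(QLORA_F1_MACRO, ZERO_SHOT_F1_MACRO))  # 25.6
    print(gain_points(FEW_SHOT_F1_MACRO, ZERO_SHOT_F1_MACRO))         # 6.0
```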

Training Prompt Template

To reproduce the reported 85.3% macro F1, the model must be given the same constraint-enforcement prompt it was trained with.

The complete schema, guidelines, and instructions are explicitly stored in the repository within the absa_prompt.json file, separated into system and user roles to maximize Qwen's instruction-following capabilities.

How to Get Started (Inference Example)

This model is a PEFT adapter. The example below automatically fetches the official absa_prompt.json prompt from this Hugging Face repository at runtime, formats the text, and runs the inference:

python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer, BitsAndBytesConfig
from huggingface_hub import hf_hub_download
import json
import torch

repo_id = "guillemolivart/qwen-2b-absa-qlora"

# 1. Download and load the prompt template from the repository files
prompt_file_path = hf_hub_download(repo_id=repo_id, filename="absa_prompt.json")
with open(prompt_file_path, "r", encoding="utf-8") as f:
    prompt_data = json.load(f)

# 2. Load the adapter on top of its base model in 4-bit (requires bitsandbytes).
#    Passing load_in_4bit directly is deprecated; use a quantization config instead.
model = AutoPeftModelForCausalLM.from_pretrained(
    repo_id,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(repo_id)

# 3. Define your custom multilingual review
review_text = "La comida buenísima, especialmente la carne, de precio correcto, pero los camareros tardaron una eternidad en llevarnos la cuenta."
review_language = "es"  # Supports "es" and "en"

# Format the user part of the prompt
user_content = prompt_data["user"].format(language=review_language, text=review_text)

# Build the ChatML message array using the separated system and user roles
messages = [
    {"role": "system", "content": prompt_data["system"]},
    {"role": "user", "content": user_content}
]

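# Tokenize with the chat template; return_dict=True also returns the attention mask
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt"
).to(model.device)

# Greedy decoding: temperature has no effect when do_sample=False, so it is omitted
with torch.inference_mode():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=False
    )

# 4. Decode only the newly generated tokens, skipping the prompt
prompt_length = inputs["input_ids"].shape[1]
generated_tokens = outputs[0][prompt_length:]
response = tokenizer.decode(generated_tokens, skip_special_tokens=True)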

print(response)
# Expected Output: {"food_quality": "positive", "food_prices": "neutral", "service": "negative"}
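
The decoded response is a string, so downstream code still has to turn it into a Python object. The sketch below is one way to do that defensively, in case the string carries whitespace or extra text around the JSON object. It is an assumption about post-processing, not part of the repository, and `extract_json_object` is a hypothetical helper name.

```python
import json


def extract_json_object(text: str) -> dict:
    """Return the first JSON object found in `text`.

    Scans for each '{' and tries to decode a complete object from there,
    so surrounding text before or after the object is ignored.
    """
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model output")


if __name__ == "__main__":
    raw = '\n{"food_quality": "positive", "food_prices": "neutral", "service": "negative"}\n'
    print(extract_json_object(raw))
```

If the model always returns clean JSON, a plain `json.loads(response)` is enough; the scan only matters when the output is wrapped in other text.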


