
README

License: mit

Model Description

| Property | Value |
|---|---|
| Base Model | microsoft/Phi-3-mini-4k-instruct |
| Fine-tuning | QLoRA + Optuna HPO |
| LoRA Rank | r=64, alpha=128 |
| Training Examples | 250 |
| Optuna Trials | 20 (TPE sampler, A100) |
| Optimized For | Unseen word generalisation |

Performance

| Words | Score |
|---|---|
| Seen words | 76.0% |
| Unseen words | 82.0% |
| Overall | 79.0% |
| Generalisation gap | -6.0% (unseen > seen) |
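
The two derived rows can be reproduced from the seen and unseen scores. The sketch below assumes that the overall score is the unweighted mean of the two (which matches the reported 79.0%) and that the gap is defined as seen minus unseen; the function names are illustrative and not part of the released code.

```python
def overall_score(seen: float, unseen: float) -> float:
    """Unweighted mean of the seen-word and unseen-word scores (in percent)."""
    return (seen + unseen) / 2


def generalisation_gap(seen: float, unseen: float) -> float:
    """Seen minus unseen, in percentage points; negative means unseen > seen."""
    return seen - unseen


seen, unseen = 76.0, 82.0
print(overall_score(seen, unseen))       # 79.0
print(generalisation_gap(seen, unseen))  # -6.0
```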

Key Finding

Optuna was configured to maximize the unseen-word score. This produced a negative generalisation gap (-6.0 percentage points): the model performs better on words it never saw during training than on words it did.

However, the overall score (79.0%) is lower than v2's (89.4%). This demonstrates metric-objective misalignment: optimizing for a single metric (the unseen-word score) hurt overall performance.

Lesson: the HPO objective should be (seen + unseen) / 2, not the unseen score alone.
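
A minimal sketch of what the balanced objective could look like in Optuna. The `evaluate` callback, the search ranges, and the sampler seed are assumptions made for illustration; the card reports only the sampler type (TPE), the trial count (20), and the best values found. Optuna is imported lazily so the two scoring helpers can be used on their own.

```python
from typing import Callable, Dict, Tuple

# Hypothetical callback: trains with the sampled hyperparameters and returns
# (seen_score, unseen_score) in percent. Not part of the released code.
EvalFn = Callable[[Dict[str, float]], Tuple[float, float]]


def unseen_only_objective(seen: float, unseen: float) -> float:
    """Objective used for v3: ignores the seen-word score entirely."""
    return unseen


def balanced_objective(seen: float, unseen: float) -> float:
    """Objective suggested by the lesson: mean of seen and unseen scores."""
    return (seen + unseen) / 2


def run_study(evaluate: EvalFn, n_trials: int = 20):
    """Run a TPE study that maximizes the balanced objective."""
    import optuna  # lazy import; requires `pip install optuna`

    def objective(trial: "optuna.Trial") -> float:
        params = {
            # Search ranges are assumptions, chosen to contain the reported best values.
            "learning_rate": trial.suggest_float("learning_rate", 1e-5, 1e-3, log=True),
            "epochs": trial.suggest_int("epochs", 1, 40),
        }
        seen, unseen = evaluate(params)
        # Record both components so any trade-off stays visible per trial.
        trial.set_user_attr("seen", seen)
        trial.set_user_attr("unseen", unseen)
        return balanced_objective(seen, unseen)

    study = optuna.create_study(
        direction="maximize",
        sampler=optuna.samplers.TPESampler(seed=0),
    )
    study.optimize(objective, n_trials=n_trials)
    return study
```

Storing the seen and unseen scores as user attributes keeps each component inspectable after the study, so a trial that wins on the mean while regressing on one side is easy to spot.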

Best Config Found by Optuna

| Parameter | Value |
|---|---|
| Learning rate | 2.36e-4 |
| Epochs | 32 |
| LoRA rank | 64 |
| LoRA alpha | 128 |
| Quantization | 4-bit |
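
For readers who want to reuse these values, the sketch below shows one way they might map onto `peft` and `transformers` configuration objects. It is not the author's training script: `target_modules="all-linear"` and `task_type="CAUSAL_LM"` are assumptions, and the card does not specify the 4-bit quantization type or compute dtype, so those are left at library defaults.

```python
BEST_CONFIG = {
    "base_model": "microsoft/Phi-3-mini-4k-instruct",
    "learning_rate": 2.36e-4,
    "num_train_epochs": 32,
    "lora_r": 64,
    "lora_alpha": 128,
    "load_in_4bit": True,
}


def lora_scaling(cfg: dict) -> float:
    """Standard LoRA scales the adapter update by alpha / r."""
    return cfg["lora_alpha"] / cfg["lora_r"]


def build_configs(cfg: dict = BEST_CONFIG):
    """Build quantization and LoRA config objects from the reported values."""
    # Lazy imports; requires `transformers` and `peft` to be installed.
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    quant = BitsAndBytesConfig(load_in_4bit=cfg["load_in_4bit"])
    lora = LoraConfig(
        r=cfg["lora_r"],
        lora_alpha=cfg["lora_alpha"],
        target_modules="all-linear",  # assumption: not stated in the card
        task_type="CAUSAL_LM",        # assumption: causal LM fine-tuning
    )
    return quant, lora


print(lora_scaling(BEST_CONFIG))  # 2.0
```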

Recommended Version

For production use, prefer v2, which achieves a higher overall score (89.4% vs 79.0%). v3 is useful as a research artifact demonstrating the trade-off between unseen-word generalisation and overall accuracy in HPO.

Model provider: ninadp

Model tree: base microsoft/Phi-3-mini-4k-instruct, adapter this model

Modalities: text input, text output
