AK04-IXR
sarvam1-hinglish-g2p-lora
License: MIT
Results
On a held-out split (n=60), the model reproduces the espeak-ng reference exactly: 0.00% phoneme error rate (PER) and 100% exact phoneme match. In other words, it generalizes the phonemizer's deterministic mapping to unseen sentences.
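The card does not say how PER and exact match were computed. The sketch below shows one common way to do it, assuming PER is the Levenshtein distance between predicted and reference phoneme sequences divided by the total reference length, and that each non-space character counts as one symbol. The function names and the per-character tokenization are assumptions for illustration, not the author's evaluation script.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two symbol sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference symbol
                curr[j - 1] + 1,     # insert a hypothesis symbol
                prev[j - 1] + cost,  # substitute, or match at zero cost
            ))
        prev = curr
    return prev[-1]


def to_symbols(ipa: str) -> List[str]:
    """Assumption: one phoneme symbol per non-space character."""
    return [ch for ch in ipa if not ch.isspace()]


def per_and_exact_match(refs: List[str], hyps: List[str]) -> Tuple[float, float]:
    """Return (PER in percent, exact-match rate in percent) over a corpus."""
    total_edits = 0
    total_ref_symbols = 0
    exact = 0
    for ref, hyp in zip(refs, hyps):
        r, h = to_symbols(ref), to_symbols(hyp)
        total_edits += edit_distance(r, h)
        total_ref_symbols += len(r)
        exact += int(r == h)
    if total_ref_symbols == 0:
        raise ValueError("reference set is empty")
    return 100.0 * total_edits / total_ref_symbols, 100.0 * exact / len(refs)
```

Under this definition, a 0.00% PER implies that every hypothesis matches its reference symbol for symbol, which is consistent with the 100% exact-match figure reported above.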
Usage
python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model and tokenizer, then attach the LoRA adapter.
tok = AutoTokenizer.from_pretrained("sarvamai/sarvam-1")
m = AutoModelForCausalLM.from_pretrained("sarvamai/sarvam-1")
m = PeftModel.from_pretrained(m, "AK04-IXR/sarvam1-hinglish-g2p-lora")

# Prompt format used by this example: "Input: <text>\nOutput:".
prompt = "Input: Mera flight ticket pee-en-aar eight three nine two hai.\nOutput:"
ids = tok(prompt, return_tensors="pt").to(m.device)

# Greedy decoding; print only the newly generated tokens.
out = m.generate(**ids, max_new_tokens=160, do_sample=False)
print(tok.decode(out[0][ids['input_ids'].shape[1]:], skip_special_tokens=True))
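When phonemizing many sentences, it can help to wrap the prompt format from the usage example in small helpers. This is a minimal sketch: `build_prompt` and `first_line` are hypothetical names, and trimming the decoded text to its first line is an assumption (the card does not say whether the model emits anything after the transcription).

```python
def build_prompt(text: str) -> str:
    """Wrap one sentence in the "Input: ...\\nOutput:" format from the usage example."""
    return f"Input: {text.strip()}\nOutput:"


def first_line(generated: str) -> str:
    """Keep only the first non-empty line of the decoded continuation.

    Assumption: the transcription is the first line and any later text
    can be discarded.
    """
    stripped = generated.strip()
    return stripped.splitlines()[0].strip() if stripped else ""
```

The output of `build_prompt` can be passed to the tokenizer in place of the hard-coded `prompt`, and `first_line` can be applied to the decoded string before printing.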
Training
LoRA (r=16, α=32; 0.94% of parameters) trained on ~7k (text → IPA) pairs phonemized by espeak-ng (en-us), for 3 epochs in bf16 on a single A100.
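To make the r and α values concrete, the sketch below computes how many weights a rank-r adapter adds to a single linear layer and the standard LoRA scaling factor α/r. The 2048×2048 layer shape is hypothetical and chosen only for illustration; the card does not list which modules were adapted or their sizes, so the per-layer ratio here is not expected to equal the 0.94% whole-model figure.

```python
def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Weights added by one LoRA pair: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


def lora_scaling(alpha: float, r: int) -> float:
    """Standard LoRA scaling applied to the low-rank update B @ A."""
    return alpha / r


R, ALPHA = 16, 32     # values stated in the Training section
D_IN = D_OUT = 2048   # hypothetical layer shape, not taken from the card

added = lora_trainable_params(D_IN, D_OUT, R)
frozen = D_IN * D_OUT
print(f"adapter weights: {added}, frozen weights: {frozen}")
print(f"ratio for this layer: {added / frozen:.4%}")
print(f"scaling factor: {lora_scaling(ALPHA, R)}")
```

In the peft library these two hyperparameters correspond to the `r` and `lora_alpha` arguments of `LoraConfig`.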
Limitations
The model is distilled from espeak-ng, so it matches that reference but does not surpass it. It was trained on Latin-script normalized text, with Devanagari-carrier lines held out. Code-switched phonemization (per-span language ID) remains an open problem.
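Because training used Latin-script text, one practical precaution is to separate out lines that contain Devanagari before sending them to the model. The sketch below checks for characters in the Unicode Devanagari block (U+0900 to U+097F). The helper names are hypothetical and this is not part of the released model; it is only one possible input filter.

```python
from typing import Iterable, List, Tuple


def contains_devanagari(text: str) -> bool:
    """True if any character falls in the Unicode Devanagari block."""
    return any("\u0900" <= ch <= "\u097f" for ch in text)


def split_by_script(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Partition lines into (no Devanagari, contains Devanagari)."""
    latin_only: List[str] = []
    with_devanagari: List[str] = []
    for line in lines:
        (with_devanagari if contains_devanagari(line) else latin_only).append(line)
    return latin_only, with_devanagari
```

Lines in the second list would need transliteration or a different phonemizer. This check works per line and does not address per-span language identification within a Latin-script sentence.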
Model provider: AK04-IXR
Base model: sarvamai/sarvam-1
Adapter: this model
Modalities: text input, text output