Faizaniqbal/KoshurAI_Tarjuma_v3 API & Inference Endpoint

Model Details


| Field | Value |
|---|---|
| Author | Faizan Iqbal (@Faizaniqbal) |
| Base model | `Faizaniqbal/KoshurAI_Tarjuma_v2` |
| Adapter type | LoRA (QLoRA training) |
| Architecture | Gemma3ForCausalLM + PEFT LoRA |
| Languages | Kashmiri (ks · kas_Arab), English (en) |
| License | Apache-2.0 |
| Training data | 16,637 curated bidirectional EN↔KS sentence pairs |
| Training compute | Google Colab GPU |

Model Tree

markdown
google/gemma-3-4b-pt
    └─ google/gemma-3-4b-it
           └─ sarvamai/sarvam-translate
                  └─ Faizaniqbal/KoshurAI_Tarjuma_v2   ← 2.8M Kashmiri pretraining
                         └─ Faizaniqbal/KoshurAI_Tarjuma_v3           ← this adapter (SFT)

Quickstart

Install

bash
pip install transformers peft accelerate bitsandbytes sentencepiece

Load & Translate

python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

BASE_MODEL = "Faizaniqbal/KoshurAI_Tarjuma_v2"
ADAPTER    = "Faizaniqbal/KoshurAI_Tarjuma_v3"

bnb_cfg = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tok = AutoTokenizer.from_pretrained(BASE_MODEL)
tok.pad_token = tok.eos_token

base  = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, quantization_config=bnb_cfg, device_map="auto"
)
model = PeftModel.from_pretrained(base, ADAPTER)
model.eval()

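# Per-direction instruction prefix and generation budget (see Inference Settings).
DIRECTIONS = {
    "en2ks": ("Translate to Kashmiri: ", 150),
    "ks2en": ("Translate to English: ", 200),
}

def translate(text, direction="en2ks"):
    if direction not in DIRECTIONS:
        raise ValueError(f"direction must be one of {sorted(DIRECTIONS)}, got {direction!r}")
    prefix, max_new_tokens = DIRECTIONS[direction]
    prompt = f"<start_of_turn>user\n{prefix}{text}<end_of_turn>\n<start_of_turn>model\n"
    # Move inputs to wherever device_map="auto" placed the model rather than assuming "cuda".
    inputs = tok(prompt, return_tensors="pt", truncation=True, max_length=512).to(model.device)
    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            min_new_tokens=5,
            do_sample=False,  # greedy decoding
            repetition_penalty=1.1,
        )
    # Decode only the newly generated tokens, skipping the prompt.
    new_tokens = out[0][inputs["input_ids"].shape[1]:]
    return tok.decode(new_tokens, skip_special_tokens=True).strip()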

print(translate("The dog is sleeping.", "en2ks"))
print(translate("ہونٛد چھُ شُنٛگِتھ", "ks2en"))

Training

Stage 1 — Kashmiri Pretraining (base model)

The base model (KoshurAI_Tarjuma_v2) was continually pretrained on 2.8 million tokens of Kashmiri text from publicly available sources (literature, journalism, academic texts, religious scholarship). This stage was intended to strengthen the model's Kashmiri language knowledge before translation fine-tuning.

Stage 2 — SFT for Translation (this adapter)

This LoRA adapter was trained on 16,637 curated bidirectional sentence pairs (EN→KS and KS→EN) to teach the model to translate explicitly in both directions.

| Split | Records |
|---|---|
| Base SFT corpus (v2) | 15,527 |
| New pairs (v3) | 1,110 |
| Total | 16,637 |
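
How a sentence pair becomes training text is not documented here. The sketch below shows one plausible construction, assuming the SFT examples reuse the prompt template from the Quickstart `translate` function and that each pair is expanded into one example per direction. The names `PREFIXES`, `format_example`, and `expand_pair` are hypothetical, and the closing `<end_of_turn>` on the target is also an assumption.

```python
# Instruction prefixes copied from the Quickstart inference code.
PREFIXES = {
    "en2ks": "Translate to Kashmiri: ",
    "ks2en": "Translate to English: ",
}

def format_example(source, target, direction):
    """Render one training string: user turn with the source, model turn with the target."""
    prompt = (
        f"<start_of_turn>user\n{PREFIXES[direction]}{source}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )
    # Assumption: the target is closed with <end_of_turn> so the model learns to stop.
    return prompt + target + "<end_of_turn>"

def expand_pair(english, kashmiri):
    """Turn one EN/KS pair into two examples, one per translation direction."""
    return [
        format_example(english, kashmiri, "en2ks"),
        format_example(kashmiri, english, "ks2en"),
    ]

examples = expand_pair("The dog is sleeping.", "<Kashmiri sentence>")
print(examples[0])
```

Whether the 16,637 records already count both directions, or are doubled as in this sketch, is not stated in this card.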

Training Configuration

| Hyperparameter | Value |
|---|---|
| Base model | `Faizaniqbal/KoshurAI_Tarjuma_v2` |
| LoRA rank (r) | 16 |
| LoRA alpha | 16 |
| LoRA dropout | 0.05 |
| LoRA target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Quantization | 4-bit NF4 (BitsAndBytes) |
| Compute dtype | bfloat16 |
| Epochs | 2 |
| Learning rate | 1e-4 |
| Effective batch size | 8 (2 × grad_accum 4) |
| Max sequence length | 512 tokens |
| Optimizer | paged_adamw_8bit |
| LR scheduler | Cosine |
| Warmup steps | 100 |
| Weight decay | 0.01 |
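
The batch size, epoch count, and warmup figures above imply a rough training schedule. The arithmetic below is only a sketch: it assumes a single GPU, that all 16,637 records were used for training with no held-out split, that each record is one training example, and that the final partial batch is kept.

```python
import math

num_examples = 16_637      # total SFT records (assumed: all used for training)
per_device_batch = 2
grad_accum_steps = 4
epochs = 2
warmup_steps = 100

# Effective batch size = per-device batch x gradient accumulation steps.
effective_batch = per_device_batch * grad_accum_steps

# Optimizer steps per epoch, keeping the last partial batch.
steps_per_epoch = math.ceil(num_examples / effective_batch)
total_steps = steps_per_epoch * epochs

# Share of training spent in learning-rate warmup.
warmup_fraction = warmup_steps / total_steps

print(effective_batch, steps_per_epoch, total_steps, round(warmup_fraction, 3))
# → 8 2080 4160 0.024
```

Under these assumptions, the 100 warmup steps cover roughly 2.4% of training before the cosine decay takes over.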

Evaluation — FLORES-200 Devtest (1,012 sentences)

| Direction | Model | BLEU | COMET |
|---|---|---|---|
| KS→EN | KoshurAI v3 (ours) | 15.74 | 0.6982 ✅ |
| KS→EN | NLLB-200 distilled-600M | 16.28 | 0.6741 |
| EN→KS | KoshurAI v3 (ours) | 30.37¹ | 0.6604 ✅ |
| EN→KS | NLLB-200 distilled-600M | 39.65¹ | 0.6431 |

¹ EN→KS BLEU is computed at the character level (tokenize='char') because the output is in Arabic script, so it should not be compared directly with the KS→EN BLEU figures. COMET is the Unbabel/wmt22-comet-da system score.

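KoshurAI v3 outperforms NLLB-200 distilled-600M on COMET in both directions (+0.0241 KS→EN, +0.0173 EN→KS), while NLLB-200 scores higher on BLEU in both directions.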
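
To make the character-level BLEU footnote concrete, here is a simplified, unsmoothed illustration of what the metric measures: n-gram overlap over characters instead of words. It is not the scoring script used for the table. The `tokenize='char'` setting matches a sacreBLEU option of that name, and sacreBLEU (assuming that is the tool used) applies smoothing, so its numbers will differ from this sketch. The function names are hypothetical.

```python
import math
from collections import Counter

def char_tokens(text):
    """Character-level tokenization: every non-whitespace character is a token."""
    return [ch for ch in text if not ch.isspace()]

def ngram_counts(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def char_bleu(hypotheses, references, max_n=4):
    """Unsmoothed corpus-level BLEU over characters, one reference per hypothesis."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = char_tokens(hyp), char_tokens(ref)
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngram_counts(h, n), ngram_counts(r, n)
            # Clipped matches: an n-gram is credited at most as often as it occurs in the reference.
            matches[n - 1] += sum((h_counts & r_counts).values())
            totals[n - 1] += sum(h_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the 1..max_n precisions, scaled by the brevity penalty.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)

print(char_bleu(["abcde"], ["abcde"]))
```

Because single characters match far more often than whole words, character-level scores tend to run higher than word-level ones, which is one reason the two directions in the table are not comparable on BLEU.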

Sample Translations (EN→KS)

English	KoshurAI v3
They include the Netherlands, with Anna Jochemsen finishing ninth.	تِیَم چھُ نیدرلینڈس شامِل کَران اَینا جوکیمسن فِنِشِنگ نائنتھ سیتھ
Hershey and Chase used phages, or viruses, to implant their own DNA.	ۂرشے تہٕ چیسن کٔرۍ فیگ تہٕ جَراثیم منٛز پنُن ڈی این اے اَزناوُنہِ خٲطر
They usually have special food, drink and entertainment offers.	تِیَمَن چھُ اکثر خاص کھٮ۪ن، چیٖز تہٕ تفریح پیش کَرنہِ یِوان

Inference Settings

| Parameter | Value |
|---|---|
| `do_sample` | `False` (greedy) |
| `max_new_tokens` | 150 (EN→KS) / 200 (KS→EN) |
| `min_new_tokens` | 5 |
| `repetition_penalty` | 1.1 |

Hardware Requirements

| Setting | VRAM |
|---|---|
| 4-bit inference (recommended) | ~6–8 GB |
| Colab free tier (T4) | ✅ with 4-bit |
| Colab L4 / A100 | ✅ comfortable |

Limitations

- Trained on sentence-level pairs (≤ 512 tokens); long-form translation unsupported.
- Performance on technical, legal, or dialectal Kashmiri is unverified.
- No human evaluation conducted; COMET and BLEU are automatic metrics only.
- 4-bit quantization used for inference; full-precision may yield higher scores.
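
Since the adapter was trained on sentence-level pairs, one possible workaround for longer inputs is to split the text into sentences and translate them one at a time. The sketch below is an untested suggestion, not a supported feature: `split_sentences` and `translate_long` are hypothetical helpers, the regex splitter is naive (it will break on abbreviations such as "Dr."), and treating `۔` and `؟` as Kashmiri sentence-final marks is an assumption.

```python
import re

# Whitespace that follows sentence-final punctuation: Latin . ! ? plus the
# Arabic-script full stop (۔) and question mark (؟).
_SENTENCE_END = re.compile(r"(?<=[.!?۔؟])\s+")

def split_sentences(text):
    """Naively split text into sentences, dropping empty pieces."""
    return [s for s in _SENTENCE_END.split(text.strip()) if s]

def translate_long(text, translate_fn, direction="en2ks"):
    """Translate a multi-sentence text sentence by sentence and rejoin with spaces.

    translate_fn is any callable taking (sentence, direction), such as the
    Quickstart translate function.
    """
    return " ".join(translate_fn(s, direction) for s in split_sentences(text))

# Stand-in translator so the sketch runs without loading the model.
demo = translate_long("Hello world. How are you?", lambda s, d: s.upper())
print(demo)
# → HELLO WORLD. HOW ARE YOU?
```

Each sentence is translated without its neighbours, so pronouns and other cross-sentence context may be rendered inconsistently.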

Citation

If you use this model, please cite:

bibtex
@misc{iqbal2026koshurai,
  title        = {KoshurAI v3: A Fine-Tuned Neural Machine Translation System
                  for Kashmiri--English Bidirectional Translation},
  author       = {Iqbal, Faizan},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/Faizaniqbal/KoshurAI_Tarjuma_v3}},
  note         = {LoRA adapter fine-tuned from Faizaniqbal/KoshurAI_Tarjuma_v2}
}

This work builds on the model by Malik & Nissar; please also cite:

bibtex
@misc{malik2026koshurkouter,
  title        = {Koshur Kouter KS-EN v1: A Merged QLoRA Kashmiri--English Translation Model},
  author       = {Malik, Haq Nawaz and Nissar, Nahfid},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/Omarrran/koshur-kouter-ks-en_v1}},
  note         = {Fine-tuned from sarvamai/sarvam-translate}
}

And the original base model:

bibtex
@misc{sarvam2025translate,
  title        = {Sarvam-Translate},
  author       = {{Sarvam AI}},
  howpublished = {\url{https://huggingface.co/sarvamai/sarvam-translate}}
}

Acknowledgements

This model builds on Omarrran/koshur-kouter-ks-en_v1, fine-tuned by Haq Nawaz Malik & Nahfid Nissar (2026), which is itself built on sarvamai/sarvam-translate (Gemma 3, 4.5B) by Sarvam AI. Evaluation used the FLORES-200 devtest set, and COMET was scored with Unbabel/wmt22-comet-da.
