TilQazyna/Til-2B-instruct API & Inference Endpoint

Model details


Base	TilQazyna/Til-2B (1977M, dense + MLA)
Format	ChatML (`<
Loss	assistant tokens only
Data	~345K instruction–response pairs (≈70% Kazakh; also ru/en/code/math)
Epochs / LR	3 / 1e-5 (cosine), bf16
Context	4096
Hardware	8×H200, DDP

The instruction mix covers QA, summarization, rephrasing, classification, NER, keyword extraction, simplification, title generation and general chat, with Russian/English/code/math instructions for cross-lingual robustness.

Usage

python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "TilQazyna/Til-2B-instruct"
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, dtype=torch.bfloat16, device_map="auto")

msgs = [{"role": "user", "content": "Денсаулықты сақтау үшін 3 кеңес бер."}]
text = tok.apply_chat_template(msgs, tokenize=False, add_generation_prompt=True)
ids = tok(text, return_tensors="pt", add_special_tokens=False).input_ids.to(model.device)
out = model.generate(ids, max_new_tokens=300, do_sample=True, temperature=0.7,
                     top_p=0.9, repetition_penalty=1.1, pad_token_id=0,
                     eos_token_id=tok.convert_tokens_to_ids("<|im_end|>"))
print(tok.decode(out[0][ids.shape[1]:], skip_special_tokens=True))

Sample outputs

Q: Абай Құнанбайұлы кім болған? A: Абай Құнанбайұлы — қазақ халқының ұлы ақыны, ойшыл және қоғам қайраткері. Ол қазіргі Қазақстан аумағында дүниеге келген, өмірінің көп бөлігін Семей қаласында өткізген… (өлеңдері мен қара сөздері туралы жалғасады).

Q: Денсаулықты сақтау үшін 3 кеңес бер. A: 1. Көбірек су ішіңіз… 2. Дұрыс тамақтаныңыз… 3. Темекіден бас тартыңыз…

Intended use & limitations

Intended: Kazakh-first assistant for QA, writing, summarization, rewriting.
Reasoning/math: arithmetic and multi-step reasoning are weak (a known limit at this scale); the model may switch to Russian on math prompts.
Factuality: can hallucinate; verify facts and numbers.
No safety alignment / RLHF has been applied.

License

Apache 2.0. Access is gated (manual approval) for usage tracking.

Til-2B-instruct

Get help setting up a custom Dedicated Endpoints.

README