License: apache-2.0
Model details
| Property | Value |
| --- | --- |
| Architecture | Llama-style decoder (RoPE, RMSNorm, SwiGLU, GQA) |
| Parameters | 1.246 B (tied input/output embeddings) |
| Hidden / layers | 2048 / 16 |
| Attention heads | 32 query / 8 KV (GQA) |
| Intermediate | 5632 |
| Context length | 2048 |
| Vocab | 256 000 (morpheme-BPE) |
| Precision | bf16 |
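The parameter count can be sanity-checked from the other rows of the table. The sketch below assumes a standard Llama-style layout that the card does not spell out: no bias terms, head dimension equal to hidden size divided by query heads, four attention projections (q, k, v, o), three SwiGLU matrices (gate, up, down), two RMSNorm weights per layer plus one final norm, and the embedding matrix counted once because it is tied. The weight and KV-cache sizes are derived from the bf16 precision row and are estimates, not measured figures.

```python
# Values taken from the model details table.
VOCAB, HIDDEN, LAYERS = 256_000, 2048, 16
Q_HEADS, KV_HEADS, INTERMEDIATE = 32, 8, 5632
CONTEXT = 2048
BYTES_PER_VALUE = 2  # bf16

# Assumption: head_dim = hidden / query heads (64 here).
HEAD_DIM = HIDDEN // Q_HEADS

# Tied input/output embeddings are counted once.
embed = VOCAB * HIDDEN

# Attention: q and o are full width, k and v are narrowed by GQA.
attn = 2 * HIDDEN * (Q_HEADS * HEAD_DIM) + 2 * HIDDEN * (KV_HEADS * HEAD_DIM)

# SwiGLU MLP: gate, up and down projections.
mlp = 3 * HIDDEN * INTERMEDIATE

# Two RMSNorm weight vectors per layer.
norms = 2 * HIDDEN

per_layer = attn + mlp + norms
total = embed + LAYERS * per_layer + HIDDEN  # + final RMSNorm

weight_gb = total * BYTES_PER_VALUE / 1e9

# KV cache: keys and values, per layer, per KV head, per token.
kv_bytes_per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES_PER_VALUE
kv_cache_mib = kv_bytes_per_token * CONTEXT / 2**20

print(f"parameters: {total:,}")                 # parameters: 1,245,775,872
print(f"bf16 weights: {weight_gb:.2f} GB")      # bf16 weights: 2.49 GB
print(f"KV cache at full context: {kv_cache_mib} MiB")  # 64.0 MiB per sequence
```

Under these assumptions the total rounds to the 1.246 B stated above. With 8 KV heads instead of 32, the per-sequence KV cache is a quarter of what full multi-head attention would need at the same context length.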
Training
| Setting | Value |
| --- | --- |
| Tokens | 6.26 B (1 epoch) |
| Train blocks | 3 057 865 × 2048 |
| Corpus | cleaned → MinHash-deduped (11.29 M / 13.19 M docs kept, 85.6 %) |
| Hardware | 8 × NVIDIA H200, FSDP full-shard, bf16 |
| Optimizer | AdamW (β₁ = 0.9, β₂ = 0.95, weight decay 0.1), cosine LR 3e-4, warmup 200 |
| Effective batch | 512 blocks (8 × 16 × grad-accum 4) ≈ 1.05 M tok/step |
| Throughput | ~313 K tok/s |
| Wall-clock | ~5 h 40 m |
| Final loss | ~2.90 (train) |
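The figures in the training table are mutually consistent, which a few lines of arithmetic can confirm. This is a sketch only: reading `8 × 16 × grad-accum 4` as 8 GPUs times a per-GPU micro-batch of 16 times 4 accumulation steps is an assumption, and the step count and ideal runtime are derived here rather than reported by the card.

```python
# Values taken from the training table.
BLOCKS = 3_057_865
BLOCK_LEN = 2048
THROUGHPUT_TOK_S = 313_000
DOCS_KEPT_M, DOCS_TOTAL_M = 11.29, 13.19

# Assumption: 8 GPUs x 16 blocks per GPU x 4 gradient-accumulation steps.
GPUS, PER_GPU_BATCH, GRAD_ACCUM = 8, 16, 4

total_tokens = BLOCKS * BLOCK_LEN
blocks_per_step = GPUS * PER_GPU_BATCH * GRAD_ACCUM
tokens_per_step = blocks_per_step * BLOCK_LEN

# Number of full optimizer steps in one epoch (derived, not stated in the card).
full_steps = BLOCKS // blocks_per_step

# Runtime if the stated throughput were sustained with no overhead.
ideal_hours = total_tokens / THROUGHPUT_TOK_S / 3600

kept_fraction = DOCS_KEPT_M / DOCS_TOTAL_M

print(f"total tokens: {total_tokens:,}")        # total tokens: 6,262,507,520
print(f"tokens per step: {tokens_per_step:,}")  # tokens per step: 1,048,576
print(f"full steps: {full_steps}")              # full steps: 5972
print(f"ideal runtime: {ideal_hours:.2f} h")    # ideal runtime: 5.56 h
print(f"docs kept: {kept_fraction:.1%}")        # docs kept: 85.6%
```

The ideal runtime of about 5 h 33 m sits slightly below the reported ~5 h 40 m wall-clock, which is what one would expect once startup, checkpointing, and similar overhead are included.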
Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

name = "stukenov/Til-Core-1B"
tok = AutoTokenizer.from_pretrained(name)
m = AutoModelForCausalLM.from_pretrained(name, dtype=torch.bfloat16).cuda().eval()

ids = tok("Қазақстан Республикасы — ", return_tensors="pt").input_ids.cuda()
out = m.generate(ids, max_new_tokens=50, do_sample=True,
                 temperature=0.8, top_p=0.95, repetition_penalty=1.2)
print(tok.decode(out[0], skip_special_tokens=True))
```
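The `generate` call above samples with `temperature=0.8` and `top_p=0.95`. To illustrate what those two arguments do to a next-token distribution, here is a minimal pure-Python sketch of temperature scaling followed by nucleus (top-p) filtering. It is not the `transformers` implementation; `softmax`, `top_p_filter`, and `toy_logits` are hypothetical names and values chosen for illustration, and `repetition_penalty` is left out.

```python
import math


def softmax(logits):
    """Convert logits to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(logits, temperature=0.8, top_p=0.95):
    """Return {token_id: probability} for the nucleus kept after scaling.

    Dividing logits by a temperature below 1 sharpens the distribution.
    The nucleus is the smallest set of highest-probability tokens whose
    cumulative probability reaches top_p; it is then renormalized.
    """
    probs = softmax([x / temperature for x in logits])
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    kept, cumulative = [], 0.0
    for token_id in ranked:
        kept.append(token_id)
        cumulative += probs[token_id]
        if cumulative >= top_p:
            break

    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


# Toy 4-token vocabulary (illustrative values only).
toy_logits = [4.0, 3.0, 1.0, -2.0]
nucleus = top_p_filter(toy_logits)
print(sorted(nucleus))  # [0, 1]
```

For this toy input, the two most likely tokens already cover more than 95 % of the probability mass after temperature scaling, so the two low-probability tokens can never be sampled.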
Sample generations
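Қазақстан Республикасы — мемлекеттік рәміздері. Жалпы білім беретін мектептің 6-сыныбына арналған оқулық… (English: "Republic of Kazakhstan: state symbols. A textbook for grade 6 of general-education schools…")
Жасанды интеллект дегеніміз бұл адам миының эволюциясы, ойлау жүйесі мен мінез-құлқының ерекшеліктерін… (English: "Artificial intelligence is the evolution of the human brain, the features of its thinking system and behavior…")
Менің Отаным — «Отан» туралы өлеңді мәнерлеп оқу… Біздің Отанымыз қалай аталады?… (English: "My Homeland: expressive reading of a poem about 'Homeland'… What is our Homeland called?…")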
Limitations
- Base model — no instruction following, no safety alignment.
- Single epoch on a 6.26 B-token corpus; factual reliability is limited.
- Corpus skews toward educational / encyclopedic Kazakh text; occasional rare-token artifacts in generation.
- Kazakh-centric; not optimized for other languages.
Roadmap
- Til Core 1B Instruct — SFT on Kazakh instruction data (see plan in repo).
- A smaller instruct sibling for on-device use.
Citation
```bibtex
@misc{tilcore1b2026,
  title  = {Til Core 1B: a Kazakh base language model with a morpheme-BPE tokenizer},
  author = {Tukenov, Saken},
  year   = {2026}
}
```
Model provider
TilQazyna