properly59

Jumini-Ko-1.2B

Highlights

🇰🇷 Korean-specialized, from scratch — Llama-3-style architecture (RoPE, GQA, SwiGLU, RMSNorm), 128K byte-level BPE tokenizer, trained from random initialization.
🥇 Beats the size-matched polyglot-ko-1.3b and the larger Tri-1.9B on HAE-RAE and Belebele-Ko (5-shot), the two Korean-language benchmarks emphasized here. (It trails polyglot-ko-1.3b on KoBEST commonsense and KMMLU, and the flagship EXAONE-4.0-1.2B overall.)
🔬 A data-centric recipe — we show that which corpus you continue-pretrain on decides which capability improves (web → commonsense, Wikipedia → knowledge).
📦 Edge-friendly — 1.26B parameters; runs comfortably on a single consumer GPU.
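
As a rough check on the single-GPU claim, the weight footprint can be estimated from the parameter count alone. This is a back-of-the-envelope sketch: the helper name `weight_memory_gib` is hypothetical, and the estimate ignores activations, the KV cache, and framework overhead.

```python
def weight_memory_gib(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return n_params * bytes_per_param / 1024**3


N_PARAMS = 1.26e9  # parameter count stated in this card

fp16_gib = weight_memory_gib(N_PARAMS)      # bf16 / fp16: 2 bytes per parameter
fp32_gib = weight_memory_gib(N_PARAMS, 4)   # fp32: 4 bytes per parameter

print(f"fp16/bf16 weights: {fp16_gib:.2f} GiB")
print(f"fp32 weights:      {fp32_gib:.2f} GiB")
```

In half precision the weights come to a little over 2 GiB, which leaves headroom on common consumer cards.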

Benchmark Results

Korean benchmarks via the EleutherAI lm-evaluation-harness, 5-shot, accuracy (%). All models evaluated under identical settings. Bold = best, underline = second best.

Benchmark	Jumini-Ko-1.2B (1.26B)	polyglot-ko-1.3b (1.43B)	Tri-1.9B (1.9B)	EXAONE-4.0-1.2B† (1.28B)
HAE-RAE (Korean knowledge)	21.9	18.7	18.9	30.0
Belebele-Ko (reading)	27.9	22.4	22.9	44.7
KMMLU (knowledge)	24.3	27.8	16.6	32.6
KoBEST (commonsense)	49.5	55.9	50.1	50.6

† EXAONE-4.0-1.2B is a strong flagship model trained on vastly more data/compute, shown as an aspirational reference. Against the open same-tier baselines (polyglot-ko-1.3b, Tri-1.9B), Jumini leads on the Korean-specific HAE-RAE and Belebele-Ko while being the smallest model.

Jumini also beats polyglot-ko-1.3b on 4 of 5 HAE-RAE subtasks (history, loan-word, rare-word, standard-nomenclature). It trails polyglot-ko-1.3b on commonsense (KoBEST) and broad knowledge (KMMLU). Full per-subtask numbers are in the technical report.
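
The comparison against the same-tier baselines can be restated as a margin per benchmark. The sketch below only re-uses the numbers from the table above; the helper `margin_over_baselines` is a hypothetical name, and a positive value means Jumini is ahead of the better of the two baselines.

```python
# 5-shot accuracy (%) copied from the benchmark table in this card.
SCORES = {
    "HAE-RAE":     {"Jumini-Ko-1.2B": 21.9, "polyglot-ko-1.3b": 18.7, "Tri-1.9B": 18.9, "EXAONE-4.0-1.2B": 30.0},
    "Belebele-Ko": {"Jumini-Ko-1.2B": 27.9, "polyglot-ko-1.3b": 22.4, "Tri-1.9B": 22.9, "EXAONE-4.0-1.2B": 44.7},
    "KMMLU":       {"Jumini-Ko-1.2B": 24.3, "polyglot-ko-1.3b": 27.8, "Tri-1.9B": 16.6, "EXAONE-4.0-1.2B": 32.6},
    "KoBEST":      {"Jumini-Ko-1.2B": 49.5, "polyglot-ko-1.3b": 55.9, "Tri-1.9B": 50.1, "EXAONE-4.0-1.2B": 50.6},
}
BASELINES = ("polyglot-ko-1.3b", "Tri-1.9B")  # EXAONE is a reference, not a baseline


def margin_over_baselines(row: dict) -> float:
    """Jumini's score minus the best same-tier baseline, in accuracy points."""
    best_baseline = max(row[name] for name in BASELINES)
    return round(row["Jumini-Ko-1.2B"] - best_baseline, 1)


margins = {bench: margin_over_baselines(row) for bench, row in SCORES.items()}
for bench, delta in margins.items():
    print(f"{bench:12s} {delta:+.1f}")
```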

Quickstart

python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "properly59/Jumini-Ko-1.2B"
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.float16, device_map="auto")

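# Instruction template: "### 질문:" (question) followed by "### 답변:" (answer).
# The question reads "What is the capital of South Korea?"
prompt = "### 질문:\n대한민국의 수도는 어디인가요?\n\n### 답변:\n"
# BOS is prepended by hand, so the tokenizer must not add special tokens again.
ids = tok(tok.bos_token + prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
out = model.generate(**ids, max_new_tokens=128, do_sample=True, temperature=0.8,
                     min_p=0.05, repetition_penalty=1.2, no_repeat_ngram_size=3,
                     pad_token_id=tok.pad_token_id)
# Decode only the newly generated tokens, skipping the prompt.
print(tok.decode(out[0][ids.input_ids.shape[1]:], skip_special_tokens=True))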
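
To avoid retyping the template for each question, the prompt string can be built by a small helper. This is a sketch: `build_prompt` is a hypothetical name, and the template is inferred from the single Quickstart example, so treat the exact whitespace as an assumption.

```python
def build_prompt(question: str) -> str:
    """Wrap a question in the '### 질문 / ### 답변' template used in the Quickstart."""
    return f"### 질문:\n{question.strip()}\n\n### 답변:\n"


example = build_prompt("대한민국의 수도는 어디인가요?")
print(example)
```

The result can be passed to the tokenizer in place of the hard-coded `prompt`, with the BOS token still prepended as in the Quickstart.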

Model Details

Architecture	Decoder-only Transformer (Llama-3 family)
Parameters	1.26B (hidden 2048, 28 layers, 32 Q / 8 KV heads, SwiGLU 4096)
Position encoding	RoPE (θ = 500,000)
Tokenizer	Byte-level BPE, 128,000 vocab
Context length	4,096
Precision	bf16 / fp16
License	Apache-2.0
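
The dimensions in the table can be cross-checked against the stated parameter count. The sketch below assumes a standard Llama-style block (no bias terms, two RMSNorm weights per layer, head dimension equal to hidden size divided by the number of query heads) and tied input/output embeddings. None of these is stated in the card, and the names `Config`, `count_params`, and `kv_cache_bytes` are hypothetical. Under these assumptions the total comes to about 1.26B, which is consistent with the table; with untied embeddings it would be about 1.52B.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    vocab_size: int = 128_000
    hidden: int = 2048
    n_layers: int = 28
    n_q_heads: int = 32
    n_kv_heads: int = 8
    ffn: int = 4096              # SwiGLU intermediate size
    tie_embeddings: bool = True  # assumption, not stated in the card


def count_params(cfg: Config) -> int:
    head_dim = cfg.hidden // cfg.n_q_heads
    kv_dim = cfg.n_kv_heads * head_dim
    attn = 2 * cfg.hidden * cfg.hidden + 2 * cfg.hidden * kv_dim  # Wq, Wo + Wk, Wv
    mlp = 3 * cfg.hidden * cfg.ffn                                # gate, up, down
    norms = 2 * cfg.hidden                                        # two RMSNorms per layer
    embed = cfg.vocab_size * cfg.hidden
    lm_head = 0 if cfg.tie_embeddings else embed
    final_norm = cfg.hidden
    return cfg.n_layers * (attn + mlp + norms) + embed + lm_head + final_norm


def kv_cache_bytes(cfg: Config, n_tokens: int, bytes_per_value: int = 2) -> int:
    """Size of the key and value caches for one sequence (GQA: only KV heads are stored)."""
    head_dim = cfg.hidden // cfg.n_q_heads
    return 2 * cfg.n_layers * cfg.n_kv_heads * head_dim * bytes_per_value * n_tokens


cfg = Config()
print(f"parameters: {count_params(cfg) / 1e9:.2f}B")
print(f"KV cache at 4,096 tokens (fp16): {kv_cache_bytes(cfg, 4096) / 1024**2:.0f} MiB")
```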

Training

A three-stage, fully documented pipeline on top of the from-scratch base:

1. Continued pre-training on a high-quality Korean mixture (FineWeb-2 kor_Hang, KOREAN-WEBTEXT, Korean Wikipedia), document-boundary packed.
2. Encyclopedic annealing on Korean Wikipedia (LR → 0), the most token-efficient route to Korean knowledge.
3. Supervised fine-tuning on a 132K permissively licensed Korean instruction mixture (KoAlpaca, OpenOrca-KO, KOpen-Platypus, KULLM-v2), with completion-only loss and explicit EOS supervision.
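
Completion-only loss with explicit EOS supervision can be illustrated on token IDs. This is a minimal sketch, not the training code: `build_example` is a hypothetical helper, and it relies on the common convention that label `-100` is ignored by the loss (the default `ignore_index` of PyTorch's `CrossEntropyLoss`).

```python
IGNORE_INDEX = -100  # label value skipped by the cross-entropy loss


def build_example(prompt_ids, completion_ids, eos_id):
    """Return (input_ids, labels) where only the completion and EOS are supervised."""
    input_ids = list(prompt_ids) + list(completion_ids) + [eos_id]
    labels = (
        [IGNORE_INDEX] * len(prompt_ids)  # prompt tokens contribute no loss
        + list(completion_ids)            # completion tokens are supervised
        + [eos_id]                        # EOS is supervised so the model learns to stop
    )
    return input_ids, labels


input_ids, labels = build_example(prompt_ids=[11, 12, 13], completion_ids=[21, 22], eos_id=2)
print(input_ids)
print(labels)
```

Supervising the EOS token is what lets the fine-tuned model end its answer instead of running to `max_new_tokens`.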

All continued-pretraining and instruction data are public corpora used only for post-training; no external pretrained weights are used. A benchmark decontamination check found 0.00% of benchmark items substantially covered (≥50% of 25-character shingles) by the instruction data.
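
The decontamination criterion can be sketched as follows. The card gives only the shingle length (25 characters) and the threshold (50%). The whitespace normalization, the set-based counting, and the function names below are assumptions made for illustration.

```python
def char_shingles(text: str, k: int = 25) -> set:
    """All overlapping k-character substrings of whitespace-normalized text."""
    text = " ".join(text.split())
    if len(text) < k:
        return {text} if text else set()
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def coverage(item: str, corpus_shingles: set, k: int = 25) -> float:
    """Fraction of the item's shingles that also occur in the corpus."""
    shingles = char_shingles(item, k)
    if not shingles:
        return 0.0
    return len(shingles & corpus_shingles) / len(shingles)


def contaminated_fraction(items, corpus_docs, k: int = 25, threshold: float = 0.5) -> float:
    """Share of benchmark items whose shingle coverage reaches the threshold."""
    corpus_shingles = set()
    for doc in corpus_docs:
        corpus_shingles |= char_shingles(doc, k)
    flagged = sum(coverage(item, corpus_shingles, k) >= threshold for item in items)
    return flagged / len(items) if items else 0.0


# Toy example with k=3 so the overlap is easy to see by eye.
toy = contaminated_fraction(["abcdef", "zzzzzz"], ["xxabcdxx"], k=3)
print(toy)
```

In the toy example, "abcdef" shares two of its four 3-character shingles with the corpus and is flagged, while "zzzzzz" shares none.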

Intended Use & Limitations

Intended for Korean text generation, QA, summarization, and research on small-model training. As a compact model trained from scratch under a constrained budget, its factual accuracy is limited and it can produce incorrect content; greedy decoding is best paired with a repetition penalty. It trails much larger / higher-budget Korean models (e.g., EXAONE) on knowledge tasks and has not undergone safety alignment. Use for research and non-critical applications only.
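
To see why a repetition penalty helps greedy decoding, consider a pure-Python sketch of the usual penalty rule: logits of already-seen tokens are divided by the penalty when positive and multiplied by it when negative. This is, to the best of our understanding, the rule that the `repetition_penalty` argument in `transformers` applies. The function names are hypothetical and real implementations operate on tensors.

```python
def apply_repetition_penalty(logits, seen_ids, penalty=1.2):
    """Lower the score of every token that already appears in the sequence."""
    out = list(logits)
    for tok in set(seen_ids):
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out


def greedy_pick(logits, seen_ids, penalty=1.2):
    """Greedy choice (argmax) after the penalty has been applied."""
    penalized = apply_repetition_penalty(logits, seen_ids, penalty)
    return max(range(len(penalized)), key=penalized.__getitem__)


logits = [2.0, 1.9, -1.0]
print(greedy_pick(logits, seen_ids=[]))   # plain greedy choice
print(greedy_pick(logits, seen_ids=[0]))  # token 0 was already generated
```

Without the penalty, greedy decoding keeps choosing token 0. Once token 0 has been seen, its score drops below that of token 1 and the loop is broken.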

Citation

bibtex
@techreport{jumini2026,
  title  = {Jumini-Ko-1.2B Technical Report},
  author = {Cho, Ju-min},
  year   = {2026},
  note   = {https://huggingface.co/properly59/Jumini-Ko-1.2B}
}
