1. Introduction
HYZ-01-Instruct is the instruction-tuned version of the HYZ-01 series developed by NeuroTürk. It builds on the base model's Turkish language understanding: supervised fine-tuning (SFT) on high-quality instruction-response pairs improves instruction following across tasks such as conversation, question answering, summarization, and code generation.
The model is built on a multilingual foundation covering 119 languages, followed by Turkish-focused continual pre-training (CPT) and fine-tuning on 372,697 instruction-response pairs. The tokenizer has been extended specifically for Turkish morphological structure and advanced use cases. HYZ-01-0.6B is the lightweight, open-source version of HYZ-01, developed by NeuroTürk for Turkish.
Note: This is the instruction fine-tuned version. For the base model, see: HYZ-01-0.6B-Base
2. Model Summary
Continual Pre-Training and Fine-Tuning
- Base model: 4-stage Turkish continual pre-training (CPT) applied on top of a multilingual foundation.
- Fine-tuning (SFT): 372,697 carefully curated Turkish instruction-response pairs.
- Optimization: LoRA (r=64) + DoRA, bfloat16, flash-attention-2, AdamW.
- Final training loss: 0.6707
Tokenizer Extension
New special tokens were added to the tokenizer for two purposes:
- Language-structure tokens: To represent Turkish morphological features more efficiently.
- Task and structure tokens: To support structural use cases such as chain-of-thought, code blocks, section markers, and language labels.
The following 20 tokens have been added to the vocabulary but were not used during training; they are defined as infrastructure for future advanced capabilities:
| Group | Tokens | Future Use |
|---|---|---|
| Brand | <|neuroturk|> <|hyz01|> <|tr|> <|en|> | Model identity and multilingual control |
| Chain-of-Thought | <|think|> <|/think|> <|step|> <|answer|> | Step-by-step reasoning (CoT) |
| Dialogue | |
Note: <|system|> <|user|> <|assistant|> tokens are actively used in the chat template.
3. Model Details
| Feature | Value |
|---|---|
| Total parameters | 595,798,016 (~0.6B) |
| Non-embedding parameters | 440,467,456 (~0.44B) |
| Hidden dimension | 1,024 |
| Number of layers | 28 |
| Attention heads (Q) | 16 |
| Attention heads (KV) | 8 (GQA) |
| Head dimension | 128 |
| Activation | SiLU |
| Normalization | RMSNorm (ε = 1 × 10⁻⁶) |
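The figures in the table can be cross-checked with a little arithmetic. The sketch below derives the embedding size, the attention projection widths, and a rough KV-cache footprint. It assumes the input and output embeddings share one matrix (tied weights) and that the cache is stored in bfloat16; neither is stated in this card, so treat the results as estimates.

```python
# Figures copied from the Model Details table.
TOTAL_PARAMS = 595_798_016
NON_EMBEDDING_PARAMS = 440_467_456
HIDDEN_DIM = 1_024
NUM_LAYERS = 28
Q_HEADS = 16
KV_HEADS = 8
HEAD_DIM = 128

# Assumption: one shared (tied) embedding matrix of shape rows x HIDDEN_DIM.
embedding_params = TOTAL_PARAMS - NON_EMBEDDING_PARAMS
embedding_rows = embedding_params // HIDDEN_DIM

# Projection widths under grouped-query attention (GQA).
q_width = Q_HEADS * HEAD_DIM       # output width of the query projection
kv_width = KV_HEADS * HEAD_DIM     # output width of each key/value projection
queries_per_kv_head = Q_HEADS // KV_HEADS

# KV cache: one key and one value vector per layer, per KV head, per token.
BYTES_PER_VALUE = 2                # assumption: bfloat16 cache entries
CONTEXT_LENGTH = 4_096             # from the Training Details table
kv_bytes_per_token = 2 * NUM_LAYERS * KV_HEADS * HEAD_DIM * BYTES_PER_VALUE
kv_mib_full_context = kv_bytes_per_token * CONTEXT_LENGTH / 2**20

print(f"embedding parameters : {embedding_params:,}")
print(f"embedding rows       : {embedding_rows:,}")
print(f"Q / KV widths        : {q_width} / {kv_width}")
print(f"queries per KV head  : {queries_per_kv_head}")
print(f"KV cache per token   : {kv_bytes_per_token:,} bytes")
print(f"KV cache at 4,096    : {kv_mib_full_context:.0f} MiB")
```

Note that the query width (16 × 128 = 2,048) is larger than the hidden dimension, so the head dimension here is set independently rather than derived as hidden dimension divided by head count.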
4. Training Details
| Setting | Value |
|---|---|
| Base model training | Multi-stage Turkish CPT |
| Fine-tuning type | Supervised Fine-Tuning (SFT) |
| Fine-tuning data size | 372,697 instruction-response pairs |
| Optimization | LoRA (r=64) + DoRA, AdamW |
| Precision | BFloat16 |
| Final loss | 0.6707 |
| LR schedule | Cosine with warmup |
| Context length | 4,096 tokens |
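As an illustration of the "Cosine with warmup" schedule listed above, here is a minimal sketch of how such a schedule is commonly computed. The step counts and peak learning rate are hypothetical placeholders, since this card does not report them, and the function is not taken from the actual training code.

```python
import math


def cosine_with_warmup(step, total_steps, warmup_steps, peak_lr, min_lr=0.0):
    """Linear warmup from 0 to peak_lr, then cosine decay down to min_lr."""
    if step < warmup_steps:
        # Warmup phase: learning rate grows linearly with the step index.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: progress runs from 0.0 (end of warmup) to 1.0 (last step).
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Hypothetical values, chosen only to make the curve easy to inspect.
TOTAL_STEPS = 1_000
WARMUP_STEPS = 100
PEAK_LR = 2e-4

for step in (0, 50, 100, 550, 1_000):
    lr = cosine_with_warmup(step, TOTAL_STEPS, WARMUP_STEPS, PEAK_LR)
    print(f"step {step:>5}: lr = {lr:.2e}")
```

The warmup phase avoids large updates while optimizer statistics are still noisy, and the cosine tail lowers the learning rate smoothly toward the end of training.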
5. Usage
Installation
pip install transformers torch accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
model_name = "neuroturk/HYZ-01-0.6B"
tokenizer = AutoTokenizer.from_pretrained(
model_name,
trust_remote_code=True,
fix_mistral_regex=True
)
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype=torch.bfloat16,
device_map="auto",
)
messages = [
{"role": "system", "content": "Senin adın HYZ-01, NeuroTürk tarafından geliştirilmiş bir Türkçe asistansın."},
{"role": "user", "content": "Yapay zeka nedir?"},
]
inputs = tokenizer.apply_chat_template(
messages,
return_tensors="pt",
return_dict=True,  # return input_ids and attention_mask so that **inputs unpacks into generate()
add_generation_prompt=True
).to(model.device)
outputs = model.generate(
**inputs,
max_new_tokens=200,
temperature=0.8,
top_p=0.95,
do_sample=True,
repetition_penalty=1.1,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Low VRAM (4-bit Quantization)
# 4-bit loading also needs the bitsandbytes package: pip install bitsandbytes
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch
bnb_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_compute_dtype=torch.bfloat16,
bnb_4bit_use_double_quant=True,
bnb_4bit_quant_type="nf4",
)
tokenizer = AutoTokenizer.from_pretrained(
"neuroturk/HYZ-01-0.6B",
trust_remote_code=True,
)
model = AutoModelForCausalLM.from_pretrained(
"neuroturk/HYZ-01-0.6B",
quantization_config=bnb_config,
device_map="auto",
)
Additional Fine-Tuning with Unsloth
from unsloth import FastLanguageModel
model, tokenizer = FastLanguageModel.from_pretrained(
model_name="neuroturk/HYZ-01-0.6B",
max_seq_length=4096,
load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
model,
r=32,
lora_alpha=64,
lora_dropout=0.0,
target_modules=[
"q_proj", "k_proj", "v_proj", "o_proj",
"gate_proj", "up_proj", "down_proj",
],
use_gradient_checkpointing="unsloth",
)
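To get a feel for how small the adapter configured above is, the following sketch counts the LoRA parameters that r=32 adds to the attention projections. It uses the dimensions from the Model Details table and assumes plain LoRA (one rank × d_in matrix and one d_out × rank matrix per projection). The MLP projections are left out because their intermediate size is not listed in this card.

```python
# Dimensions from the Model Details table.
HIDDEN_DIM = 1_024
NUM_LAYERS = 28
Q_WIDTH = 16 * 128    # query heads x head dimension
KV_WIDTH = 8 * 128    # key/value heads x head dimension

# Values from the get_peft_model call.
LORA_RANK = 32
LORA_ALPHA = 64


def lora_params(d_in, d_out, rank):
    """Parameters of one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


# (input width, output width) of each attention projection.
attention_shapes = {
    "q_proj": (HIDDEN_DIM, Q_WIDTH),
    "k_proj": (HIDDEN_DIM, KV_WIDTH),
    "v_proj": (HIDDEN_DIM, KV_WIDTH),
    "o_proj": (Q_WIDTH, HIDDEN_DIM),
}

per_layer = sum(lora_params(d_in, d_out, LORA_RANK)
                for d_in, d_out in attention_shapes.values())
attention_total = per_layer * NUM_LAYERS
scaling = LORA_ALPHA / LORA_RANK  # factor applied to the low-rank update

print(f"LoRA parameters per layer (attention only): {per_layer:,}")
print(f"LoRA parameters, all layers (attention only): {attention_total:,}")
print(f"LoRA scaling factor (alpha / r): {scaling}")
```

The gate_proj, up_proj, and down_proj adapters add further parameters on top of this attention-only figure.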
GGUF Quantizations
For faster inference and lower resource usage, GGUF quantized versions of HYZ-01-0.6B are available. These were kindly provided by mradermacher.
You can find them here: HYZ-01-0.6B-GGUF
Using with llama.cpp
- Download the GGUF file (e.g., hyz-01-0.6b-q4_k_m.gguf) from the repository above.
- Run it with llama.cpp (recent llama.cpp builds name this binary llama-cli instead of main):
./main -m hyz-01-0.6b-q4_k_m.gguf -p "Your prompt here" -n 512
For a detailed explanation of quantization types (e.g., Q4_K_M, Q5_K_M), see the llama.cpp documentation.
Note: These GGUF files are not officially maintained by NeuroTürk, but they are community-tested and widely used. Thanks again to mradermacher for the contribution.
6. Chat Template
{% for message in messages %}
{% if message['role'] == 'system' %}
<|system|>
{{ message['content'] }}<|endoftext|>
{% elif message['role'] == 'user' %}
<|user|>
{{ message['content'] }}<|endoftext|>
{% elif message['role'] == 'assistant' %}
<|assistant|>
{{ message['content'] }}<|endoftext|>
{% endif %}
{% endfor %}
{% if add_generation_prompt %}<|assistant|>
{% endif %}
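To make the resulting prompt layout concrete, the sketch below mirrors the template logic in plain Python. The helper name render_chat is hypothetical, and the exact newline placement is an assumption, because whitespace handling depends on how the Jinja template is stored. For the authoritative string, call tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True).

```python
def render_chat(messages, add_generation_prompt=True):
    """Approximate the chat template: role tag, content, then <|endoftext|>."""
    parts = []
    for message in messages:
        role = message["role"]
        # The template only handles these three roles; others are skipped.
        if role in ("system", "user", "assistant"):
            parts.append(f"<|{role}|>\n{message['content']}<|endoftext|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|assistant|>\n")
    return "".join(parts)


example_messages = [
    {"role": "system", "content": "Sen yardımcı bir asistansın."},
    {"role": "user", "content": "Yapay zeka nedir?"},
]
prompt = render_chat(example_messages)
print(prompt)
```

Each turn ends with <|endoftext|>, so generation should normally stop when the model emits that token after its answer.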
7. Evaluation Results
All evaluations were conducted using lm-evaluation-harness.
| Task | Category | Setting | Score |
|---|---|---|---|
| TurBLiMP (ditransitive) | Grammar | 0-shot | 89.10% |
| TurBLiMP (transitive) | Grammar | 0-shot | 86.40% |
| XCOPA TR | Causality | 0-shot | 56.80% |
| XNLI TR | Natural language inference | 0-shot | |
Note: XQuAD TR was evaluated in generative question-answering format. The Exact Match (EM) score appears low due to strict string matching requirements; the F1 score better reflects the model's actual performance.
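The gap between EM and F1 is easy to reproduce. The sketch below is a simplified SQuAD-style scorer, not the exact lm-evaluation-harness implementation: normalization is reduced to lowercasing and punctuation removal, and the example answer pair is invented for illustration.

```python
import re
from collections import Counter


def normalize(text):
    """Lowercase, replace punctuation with spaces, and split into tokens."""
    text = text.lower()
    text = re.sub(r"[^\w\s]", " ", text)
    return text.split()


def exact_match(prediction, reference):
    """1.0 only if the normalized token sequences are identical."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction, reference):
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction)
    ref_tokens = normalize(reference)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Invented example: the prediction is correct but adds one extra word.
reference = "1923"
prediction = "1923 yılında"

print("EM:", exact_match(prediction, reference))
print("F1:", round(token_f1(prediction, reference), 3))
```

A generative model often wraps the correct span in extra words, which yields zero credit under EM but partial credit under F1.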
Note: TokSuite TR and MGSM TR evaluations are ongoing; results will be added upon completion.
The benchmarks above do not directly measure everyday conversation, text summarization, code generation, or open-ended question answering, so the model may perform somewhat better on those tasks than the scores suggest.
8. Limitations
- Although the model generally follows instructions well, it may occasionally produce incorrect or inconsistent outputs.
- Complex multi-step reasoning may be limited with 0.6B parameters.
- Biases present in the training data may be reflected in outputs.
- Performance drops significantly in languages other than Turkish.
- Human verification of outputs is recommended for critical applications.
9. Citation
@misc{neuroturk2026hyz01,
author = {NeuroTürk},
title = {HYZ-01-0.6B: A Lightweight Turkish Instruction Model},
year = 2026,
}