1. Introduction
HYZ-01-0.6B-Base is the base (pre-trained only) version of the HYZ-01 series developed by NeuroTürk. It is a raw language model that has undergone multi-stage Turkish continual pre-training (CPT) on top of a multilingual foundation, without any instruction tuning or alignment. It is intended for researchers and developers who wish to fine-tune the model for their own tasks.
The multilingual foundation covers 119 languages, and continual pre-training focuses on Turkish. The tokenizer has been extended with tokens for Turkish morphological structure and for structural use cases (see Tokenizer Extension in Section 2). HYZ-01-0.6B-Base is the lightweight, open-source base model of the series.
Note: This is the base pre-trained version. For the instruction-tuned version, see: HYZ-01-0.6B
2. Model Summary
Continual Pre-Training
- Base model: 4-stage Turkish continual pre-training (CPT) applied on top of a multilingual foundation.
- Training stages include general Turkish web corpus, curated domain data, Wikipedia, and high-quality filtered text.
- Optimization: bfloat16, flash-attention-2, AdamW.
Tokenizer Extension
New special tokens were added to the tokenizer for two purposes:
- Language-structure tokens: To represent Turkish morphological features more efficiently.
- Task and structure tokens: To support structural use cases such as chain-of-thought, code blocks, section markers, and language labels.
The following 20 tokens have been added to the vocabulary and are reserved as infrastructure for future advanced capabilities:
| Group | Tokens | Future Use |
|---|---|---|
| Brand | <|neuroturk|> <|hyz01|> <|tr|> <|en|> | Model identity and multilingual control |
| Chain-of-Thought | <|think|> <|/think|> <|step|> <|answer|> | Step-by-step reasoning (CoT) |
| Dialogue | |
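Before building a fine-tuning format around the reserved tokens, it may be worth confirming that the loaded tokenizer actually maps each one to a vocabulary entry. The following is a minimal sketch: `missing_special_tokens` is a hypothetical helper name, it covers only the eight tokens visible in the table above, and it relies on nothing beyond the `convert_tokens_to_ids` method that Hugging Face tokenizers expose.

```python
# Tokens copied from the Brand and Chain-of-Thought rows of the table above.
RESERVED_TOKENS = [
    "<|neuroturk|>", "<|hyz01|>", "<|tr|>", "<|en|>",
    "<|think|>", "<|/think|>", "<|step|>", "<|answer|>",
]


def missing_special_tokens(tokenizer, tokens=RESERVED_TOKENS):
    """Return the tokens that the tokenizer does not map to a known ID.

    Hypothetical helper: works with any object exposing
    `convert_tokens_to_ids` and, optionally, `unk_token_id`.
    """
    unk_id = getattr(tokenizer, "unk_token_id", None)
    missing = []
    for token in tokens:
        token_id = tokenizer.convert_tokens_to_ids(token)
        # Unknown tokens come back as None or as the unknown-token ID.
        if token_id is None or token_id == unk_id:
            missing.append(token)
    return missing
```

With a real tokenizer this would be called as `missing_special_tokens(AutoTokenizer.from_pretrained("neuroturk/HYZ-01-0.6B-Base"))`; an empty list would indicate that all eight tokens are present.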
3. Model Details
| Feature | Value |
|---|---|
| Total parameters | 595,798,016 (~0.6B) |
| Non-embedding parameters | 440,467,456 (~0.44B) |
| Hidden dimension | 1,024 |
| Number of layers | 28 |
| Attention heads (Q) | 16 |
| Attention heads (KV) | 8 (GQA) |
| Head dimension | 128 |
| Activation | SiLU |
| Normalization | RMSNorm (ε = 1 × 10⁻⁶) |
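The figures in this table can be cross-checked with a little arithmetic. The sketch below derives the embedding parameter count, the vocabulary size it implies, and the KV-cache footprint per token. It assumes a single tied embedding matrix of shape [vocabulary, hidden] and a bfloat16 cache; neither is stated in this card, so treat the results as estimates. The 4,096-token context length comes from the Training Details table.

```python
# Values copied from the Model Details table.
TOTAL_PARAMS = 595_798_016
NON_EMBEDDING_PARAMS = 440_467_456
HIDDEN_DIM = 1_024
NUM_LAYERS = 28
NUM_Q_HEADS = 16
NUM_KV_HEADS = 8
HEAD_DIM = 128
CONTEXT_LENGTH = 4_096  # from the Training Details table

# ASSUMPTION: one shared (tied) embedding matrix of shape [vocab, hidden].
embedding_params = TOTAL_PARAMS - NON_EMBEDDING_PARAMS
vocab_rows, remainder = divmod(embedding_params, HIDDEN_DIM)

# Under GQA the query projection is wider than the key/value projections.
q_width = NUM_Q_HEADS * HEAD_DIM
kv_width = NUM_KV_HEADS * HEAD_DIM

# KV cache: keys and values, for every layer, per token.
# ASSUMPTION: cache stored in bfloat16 (2 bytes per value).
BYTES_PER_VALUE = 2
kv_bytes_per_token = 2 * NUM_LAYERS * kv_width * BYTES_PER_VALUE
kv_mib_full_context = kv_bytes_per_token * CONTEXT_LENGTH / 2**20

print(f"embedding parameters:    {embedding_params:,}")
print(f"implied vocabulary rows: {vocab_rows:,} (remainder {remainder})")
print(f"query width / KV width:  {q_width} / {kv_width}")
print(f"KV cache per token:      {kv_bytes_per_token:,} bytes")
print(f"KV cache at full context: {kv_mib_full_context:.0f} MiB")
```

A remainder of zero is consistent with the tied-embedding assumption but does not prove it. Because only 8 of the 16 heads carry keys and values, the cache is half the size it would be with full multi-head attention.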
4. Training Details
| Setting | Value |
|---|---|
| Training type | Continual Pre-Training (CPT) |
| Number of stages | 4 |
| Optimization | AdamW |
| Precision | BFloat16 |
| LR schedule | Cosine with warmup |
| Context length | 4,096 tokens |
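For readers unfamiliar with the "cosine with warmup" schedule listed above, here is a minimal sketch of its general shape. The linear warmup, the peak learning rate, and the step counts are all hypothetical values chosen for illustration; this card does not state the actual hyperparameters used for HYZ-01.

```python
import math


def cosine_with_warmup(step, total_steps, warmup_steps, peak_lr, min_lr=0.0):
    """Learning rate at `step`: linear warmup, then cosine decay to `min_lr`."""
    if step < warmup_steps:
        # Ramp linearly from 0 up to the peak learning rate.
        return peak_lr * step / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (peak_lr - min_lr) * cosine


# Hypothetical settings, used only to show the shape of the curve.
for step in (0, 50, 100, 550, 1000):
    lr = cosine_with_warmup(step, total_steps=1000, warmup_steps=100, peak_lr=1e-4)
    print(f"step {step:>4}: lr = {lr:.2e}")
```

The rate rises to its peak at the end of warmup, passes through half the peak midway through the decay phase, and reaches `min_lr` at the final step.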
5. Usage
Warning: This is a base model. It is not instruction-tuned and will not follow instructions reliably. For conversational or task-oriented use, use the instruction-tuned version: HYZ-01-0.6B
Installation
pip install transformers torch accelerate
Text Generation (Completion)
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "neuroturk/HYZ-01-0.6B-Base"

# Load the extended tokenizer that ships with the model.
tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    trust_remote_code=True,
    fix_mistral_regex=True,
)

# Load the weights in bfloat16 and place them on the available device(s).
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# A base model continues text, so the prompt is an unfinished sentence.
prompt = "Yapay zeka, bilgisayar sistemlerinin"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    temperature=0.8,
    top_p=0.95,
    do_sample=True,
    repetition_penalty=1.1,
)

# Keep only the newly generated tokens (drop the prompt).
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
Low VRAM (4-bit Quantization)
# Requires the bitsandbytes package in addition to the packages above.
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

# 4-bit NF4 weights with double quantization; compute runs in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)

tokenizer = AutoTokenizer.from_pretrained(
    "neuroturk/HYZ-01-0.6B-Base",
    trust_remote_code=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "neuroturk/HYZ-01-0.6B-Base",
    quantization_config=bnb_config,
    device_map="auto",
)
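To get a rough sense of what 4-bit loading saves, the weight footprint can be estimated from the parameter count in the Model Details table. This is a back-of-the-envelope sketch, not a measurement: it assumes every parameter is stored at the stated bit width, whereas in practice some tensors stay in higher precision and quantization constants add overhead, so the 4-bit figure is best read as a lower bound. Activations and the KV cache are not included.

```python
TOTAL_PARAMS = 595_798_016  # from the Model Details table


def weight_gib(num_params, bits_per_param):
    """Idealized weight storage in GiB at a uniform bit width."""
    return num_params * bits_per_param / 8 / 2**30


bf16_gib = weight_gib(TOTAL_PARAMS, 16)
nf4_gib = weight_gib(TOTAL_PARAMS, 4)

print(f"bfloat16 weights: ~{bf16_gib:.2f} GiB")
print(f"4-bit weights:    ~{nf4_gib:.2f} GiB (idealized lower bound)")
```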
Fine-Tuning with Unsloth
# Requires the unsloth package.
from unsloth import FastLanguageModel

# Load the base model in 4-bit at the 4,096-token training context length.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="neuroturk/HYZ-01-0.6B-Base",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach LoRA adapters to all attention and MLP projections.
model = FastLanguageModel.get_peft_model(
    model,
    r=32,
    lora_alpha=64,
    lora_dropout=0.0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    use_gradient_checkpointing="unsloth",
)
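The number of trainable parameters this LoRA configuration adds can be estimated from the layer shapes. The sketch below is an approximation under stated assumptions: the attention widths follow from the Model Details table (heads × head dimension), but the MLP intermediate size of 3,072 is an assumption, since this card does not list it. Each adapted linear layer with `in` inputs and `out` outputs gains `r × (in + out)` parameters.

```python
# From the Model Details table.
HIDDEN_DIM = 1_024
NUM_LAYERS = 28
Q_WIDTH = 16 * 128   # query heads x head dimension
KV_WIDTH = 8 * 128   # key/value heads x head dimension

# ASSUMPTION: MLP intermediate size; not stated in this model card.
INTERMEDIATE_DIM = 3_072

LORA_RANK = 32  # r=32, as in the configuration above

# (in_features, out_features) for each targeted projection.
MODULE_SHAPES = {
    "q_proj": (HIDDEN_DIM, Q_WIDTH),
    "k_proj": (HIDDEN_DIM, KV_WIDTH),
    "v_proj": (HIDDEN_DIM, KV_WIDTH),
    "o_proj": (Q_WIDTH, HIDDEN_DIM),
    "gate_proj": (HIDDEN_DIM, INTERMEDIATE_DIM),
    "up_proj": (HIDDEN_DIM, INTERMEDIATE_DIM),
    "down_proj": (INTERMEDIATE_DIM, HIDDEN_DIM),
}


def lora_params(in_features, out_features, rank):
    """Parameters in one adapter: A is [rank, in], B is [out, rank]."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, LORA_RANK) for i, o in MODULE_SHAPES.values())
total_lora = per_layer * NUM_LAYERS

print(f"LoRA parameters per layer: {per_layer:,}")
print(f"LoRA parameters in total:  {total_lora:,}")
```

Under these assumptions the adapters amount to a few percent of the 595,798,016 base parameters, which is why LoRA fine-tuning fits on modest hardware.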
GGUF Quantizations
For faster inference and lower resource usage, GGUF quantized versions of HYZ-01-0.6B-Base are available. These were kindly provided by mradermacher.
You can find them here: HYZ-01-0.6B-Base-GGUF
Using with llama.cpp
- Download the GGUF file (e.g., hyz-01-0.6b-base-q4_k_m.gguf) from the repository above.
- Run with llama.cpp (in recent llama.cpp builds the binary is named llama-cli rather than main):
./main -m hyz-01-0.6b-base-q4_k_m.gguf -p "Your prompt here" -n 512
For a detailed explanation of quantization types (e.g., Q4_K_M, Q5_K_M), see the llama.cpp documentation.
Note: These GGUF files are not officially maintained by NeuroTürk, but they are community-tested and widely used. Thanks again to mradermacher for the contribution.
6. Limitations
- This is a base model without instruction tuning, so it will not follow instructions reliably.
- Complex multi-step reasoning may be limited with 0.6B parameters.
- Biases present in the training data may be reflected in outputs.
- Performance drops significantly in languages other than Turkish.
- Human verification of outputs is recommended for critical applications.
7. Citation
@misc{neuroturk2026hyz01,
  author = {NeuroTürk},
  title  = {HYZ-01-0.6B: A Lightweight Turkish Base Model},
  year   = {2026},
}