License: apache-2.0
1. Introduction
HYZ-01-0.6B-Base is the base (pre-trained only) version of the HYZ-01 series developed by NeuroTürk. It is a raw language model that has undergone multi-stage Turkish continual pre-training (CPT) on top of a multilingual foundation, without any instruction tuning or alignment. It is intended for researchers and developers who wish to fine-tune the model for their own tasks.
The model is built on a multilingual foundation covering 119 languages and has been continually pre-trained with a focus on Turkish. The tokenizer has been extended specifically for Turkish morphological structure and advanced use cases.
Note: This is the base pre-trained version. For the instruction-tuned version, see: HYZ-01-0.6B
2. Model Summary
Continual Pre-Training
- Base model: 4-stage Turkish continual pre-training (CPT) applied on top of a multilingual foundation.
- Training stages include general Turkish web corpus, curated domain data, Wikipedia, and high-quality filtered text.
- Training setup: bfloat16 precision, FlashAttention-2 attention kernels, AdamW optimizer.
Tokenizer Extension
New special tokens were added to the tokenizer for two purposes:
- Language-structure tokens: To represent Turkish morphological features more efficiently.
- Task and structure tokens: To support structural use cases such as chain-of-thought, code blocks, section markers, and language labels.
The following 20 tokens have been added to the vocabulary and are reserved as infrastructure for future advanced capabilities:
| Group | Tokens | Future Use |
|---|---|---|
| Brand | `<\|neuroturk\|>` `<\|hyz01\|>` `<\|tr\|>` `<\|en\|>` | Model identity and multilingual control |
| Chain-of-Thought | `<\|think\|>` `<\|/think\|>` `<\|step\|>` `<\|answer\|>` | Step-by-step reasoning (CoT) |
| Dialogue | `<\|system\|>` `<\|user\|>` `<\|assistant\|>` `<\|end\|>` | Multi-turn dialogue and role management |
| Code | `<\|code\|>` `<\|/code\|>` `<\|output\|>` `<\|error\|>` | Structured code generation and debugging |
| Structure | `<\|title\|>` `<\|section\|>` `<\|list\|>` `<\|note\|>` | Long-form and structured text generation (reports, articles, etc.) |
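One way to confirm that the added tokens are present is to look them up in the tokenizer's vocabulary. The following is a minimal sketch under stated assumptions: `missing_tokens` is a hypothetical helper that is not part of this model card, and it expects a plain token-to-id mapping such as the one returned by `tokenizer.get_vocab()` in Transformers. The toy vocabulary is for illustration only.

```python
# The 20 reserved tokens listed in the table above, grouped as in the card.
RESERVED_TOKENS = {
    "Brand": ["<|neuroturk|>", "<|hyz01|>", "<|tr|>", "<|en|>"],
    "Chain-of-Thought": ["<|think|>", "<|/think|>", "<|step|>", "<|answer|>"],
    "Dialogue": ["<|system|>", "<|user|>", "<|assistant|>", "<|end|>"],
    "Code": ["<|code|>", "<|/code|>", "<|output|>", "<|error|>"],
    "Structure": ["<|title|>", "<|section|>", "<|list|>", "<|note|>"],
}


def missing_tokens(vocab):
    """Return the reserved tokens that are absent from a token-to-id mapping.

    `vocab` is any dict-like mapping from token string to id, for example
    the result of `tokenizer.get_vocab()`. Hypothetical helper.
    """
    return [
        token
        for group in RESERVED_TOKENS.values()
        for token in group
        if token not in vocab
    ]


# Toy vocabulary holding only the Brand tokens; a real check would pass
# the vocabulary of the loaded HYZ-01 tokenizer instead.
toy_vocab = {tok: i for i, tok in enumerate(RESERVED_TOKENS["Brand"])}
print(missing_tokens(toy_vocab))
```

An empty list from the real tokenizer would indicate that every reserved token has its own vocabulary entry.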
3. Model Details
| Feature | Value |
|---|---|
| Total parameters | 595,798,016 (~0.6B) |
| Non-embedding parameters | 440,467,456 (~0.44B) |
| Hidden dimension | 1,024 |
| Number of layers | 28 |
| Attention heads (Q) | 16 |
| Attention heads (KV) | 8 (GQA) |
| Head dimension | 128 |
| Activation | SiLU |
| Normalization | RMSNorm (ε = 1 × 10⁻⁶) |
| Positional encoding | RoPE (θ = 1,000,000) |
| Vocabulary size | 151,690 |
| Training context length | 4,096 tokens |
| Theoretical max context | 32,768 tokens |
| Precision | BFloat16 |
| VRAM usage (16-bit weights only) | ~1.11 GiB |
| Disk size | ~1.11 GiB |
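The figures in the table can be cross-checked with simple arithmetic. The sketch below uses only values from the table and assumes 2 bytes per value (bfloat16) and a batch size of 1. It shows that the gap between total and non-embedding parameters equals exactly one vocabulary-by-hidden matrix, which is consistent with the input and output embeddings sharing weights (the card does not state this explicitly). It also shows that the weights come to about 1.11 when expressed in GiB. The key-value cache estimate is an addition for illustration and assumes the cache is stored in 16-bit precision.

```python
# Values copied from the Model Details table above.
TOTAL_PARAMS = 595_798_016
NON_EMBEDDING_PARAMS = 440_467_456
HIDDEN_DIM = 1_024
VOCAB_SIZE = 151_690
NUM_LAYERS = 28
KV_HEADS = 8
HEAD_DIM = 128
BYTES_PER_VALUE = 2  # assumption: bfloat16, 2 bytes per stored value

# One row of HIDDEN_DIM values per vocabulary entry.
embedding_params = VOCAB_SIZE * HIDDEN_DIM
# The two parameter counts in the table differ by exactly this amount.
assert TOTAL_PARAMS - NON_EMBEDDING_PARAMS == embedding_params

# Weights only; activations and the KV cache come on top of this.
weight_gib = TOTAL_PARAMS * BYTES_PER_VALUE / 2**30


def kv_cache_bytes(num_tokens):
    """Bytes needed to cache keys and values for all layers (batch size 1)."""
    per_token = 2 * NUM_LAYERS * KV_HEADS * HEAD_DIM * BYTES_PER_VALUE
    return per_token * num_tokens


print(f"embedding parameters: {embedding_params:,}")
print(f"weights: {weight_gib:.2f} GiB")
print(f"KV cache at 4,096 tokens: {kv_cache_bytes(4096) / 2**20:.0f} MiB")
print(f"KV cache at 32,768 tokens: {kv_cache_bytes(32768) / 2**30:.1f} GiB")
```

Under these assumptions, the cache at the theoretical maximum context is several times larger than the weights themselves, which is worth budgeting for when running long prompts.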
4. Training Details
| Setting | Value |
|---|---|
| Training type | Continual Pre-Training (CPT) |
| Number of stages | 4 |
| Optimization | AdamW |
| Precision | BFloat16 |
| LR schedule | Cosine with warmup |
| Context length | 4,096 tokens |
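The "cosine with warmup" schedule named in the table can be sketched as a small function. This is an illustration of the general technique, not the training code used for HYZ-01: the card does not state the peak learning rate, warmup length, or total step count, so `peak_lr`, `warmup_steps`, and `total_steps` below are hypothetical values chosen only for the example.

```python
import math


def lr_at_step(step, total_steps, warmup_steps, peak_lr, min_lr=0.0):
    """Linear warmup to peak_lr, followed by cosine decay down to min_lr."""
    if step < warmup_steps:
        # Ramp up linearly; the last warmup step reaches peak_lr.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (peak_lr - min_lr) * cosine


# Hypothetical settings for illustration; not taken from the model card.
TOTAL_STEPS, WARMUP_STEPS, PEAK_LR = 1_000, 100, 1e-4
for step in (0, 99, 100, 550, 1_000):
    print(step, lr_at_step(step, TOTAL_STEPS, WARMUP_STEPS, PEAK_LR))
```

The rate rises during warmup, peaks when warmup ends, falls to half of the peak midway through the decay phase, and reaches `min_lr` at the final step.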
5. Usage
Warning: This is a base model. It is not instruction-tuned and will not follow instructions reliably. For conversational or task-oriented use, use the instruction-tuned version: HYZ-01-0.6B
Installation
```bash
pip install transformers torch accelerate
```
Text Generation (Completion)
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "neuroturk/HYZ-01-0.6B-Base"

tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    trust_remote_code=True,
    fix_mistral_regex=True,
)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# A base model continues text, so the prompt is an unfinished sentence.
prompt = "Yapay zeka, bilgisayar sistemlerinin"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    temperature=0.8,
    top_p=0.95,
    do_sample=True,
    repetition_penalty=1.1,
)

# Decode only the newly generated tokens, not the prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```
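The sampling arguments in the generation example interact: `temperature` rescales the logits before the softmax, and `top_p` then keeps the smallest set of most likely tokens whose cumulative probability reaches the threshold. The following pure-Python sketch illustrates that idea on a toy distribution. It is not the Transformers implementation, and `softmax`, `top_p_filter`, and the toy logits are hypothetical names and values introduced for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return {i: probs[i] / mass for i in kept}


# Toy logits for a four-token vocabulary (illustrative values only).
toy_logits = [2.0, 1.0, 0.0, -3.0]
probs = softmax(toy_logits, temperature=0.8)
filtered = top_p_filter(probs, top_p=0.95)
print(filtered)
```

With these toy values, the least likely token falls outside the 0.95 nucleus and is never sampled. A temperature below 1.0 sharpens the distribution, so fewer tokens tend to survive the filter.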
Low VRAM (4-bit Quantization)
```python
# Requires the bitsandbytes package in addition to the packages above:
#   pip install bitsandbytes
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

# 4-bit NF4 quantization with double quantization; compute runs in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)

tokenizer = AutoTokenizer.from_pretrained(
    "neuroturk/HYZ-01-0.6B-Base",
    trust_remote_code=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "neuroturk/HYZ-01-0.6B-Base",
    quantization_config=bnb_config,
    device_map="auto",
)
```
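A rough estimate of how much memory the 4-bit configuration saves can be worked out from the parameter counts in Model Details. The sketch below rests on assumptions that the card does not confirm: that only the non-embedding weights are quantized, that the embedding matrix stays in a 16-bit format, and that each quantized weight costs about 4.127 bits (4 bits plus the block-scale overhead reported for NF4 with double quantization in the QLoRA paper). Actual usage depends on the library version, activations, and the KV cache, so this is a ballpark figure rather than a measurement.

```python
# Parameter counts from the Model Details table.
TOTAL_PARAMS = 595_798_016
NON_EMBEDDING_PARAMS = 440_467_456
EMBEDDING_PARAMS = TOTAL_PARAMS - NON_EMBEDDING_PARAMS

# ASSUMPTIONS (not stated in the model card):
BITS_PER_QUANTIZED_WEIGHT = 4.127  # NF4 + double-quantization overhead
BYTES_PER_UNQUANTIZED_WEIGHT = 2   # embeddings kept in a 16-bit format


def estimate_4bit_weight_bytes():
    """Approximate weight memory when only non-embedding weights are 4-bit."""
    quantized = NON_EMBEDDING_PARAMS * BITS_PER_QUANTIZED_WEIGHT / 8
    unquantized = EMBEDDING_PARAMS * BYTES_PER_UNQUANTIZED_WEIGHT
    return quantized + unquantized


full_gib = TOTAL_PARAMS * 2 / 2**30
quant_gib = estimate_4bit_weight_bytes() / 2**30
print(f"16-bit weights: {full_gib:.2f} GiB")
print(f"4-bit estimate: {quant_gib:.2f} GiB")
```

Under these assumptions the saving is closer to one half than to three quarters, because the large embedding matrix accounts for about a quarter of all parameters and is not quantized.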
Fine-Tuning with Unsloth
```python
from unsloth import FastLanguageModel

# Load the base model in 4-bit with the training context length.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="neuroturk/HYZ-01-0.6B-Base",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach LoRA adapters to the attention and MLP projections.
model = FastLanguageModel.get_peft_model(
    model,
    r=32,
    lora_alpha=64,
    lora_dropout=0.0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    use_gradient_checkpointing="unsloth",
)
```
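To get a feel for how many parameters the LoRA configuration above trains, note that an adapter of rank `r` on a linear layer with `in_features` inputs and `out_features` outputs adds `r * (in_features + out_features)` parameters. The sketch below applies that formula to the seven target modules. The attention shapes are derived from the Model Details table on the assumption that the projections follow the usual grouped-query layout. The MLP width is not given in the card, so `ASSUMED_INTERMEDIATE_SIZE` is a placeholder assumption that should be replaced with the value from the model's configuration.

```python
# Dimensions from the Model Details table.
HIDDEN_DIM = 1_024
NUM_LAYERS = 28
Q_HEADS, KV_HEADS, HEAD_DIM = 16, 8, 128

# ASSUMPTION: the card does not state the MLP width; 3,072 is a placeholder.
ASSUMED_INTERMEDIATE_SIZE = 3_072


def lora_params(in_features, out_features, r):
    """Adapter size: A is (r x in_features) and B is (out_features x r)."""
    return r * (in_features + out_features)


def trainable_lora_params(r, intermediate_size):
    """Total adapter parameters across all layers for the seven target modules."""
    q_out = Q_HEADS * HEAD_DIM    # query projection width
    kv_out = KV_HEADS * HEAD_DIM  # key/value projection width under GQA
    shapes = {
        "q_proj": (HIDDEN_DIM, q_out),
        "k_proj": (HIDDEN_DIM, kv_out),
        "v_proj": (HIDDEN_DIM, kv_out),
        "o_proj": (q_out, HIDDEN_DIM),
        "gate_proj": (HIDDEN_DIM, intermediate_size),
        "up_proj": (HIDDEN_DIM, intermediate_size),
        "down_proj": (intermediate_size, HIDDEN_DIM),
    }
    per_layer = sum(lora_params(i, o, r) for i, o in shapes.values())
    return per_layer * NUM_LAYERS


total = trainable_lora_params(r=32, intermediate_size=ASSUMED_INTERMEDIATE_SIZE)
print(f"trainable LoRA parameters: {total:,}")
print(f"LoRA scaling (lora_alpha / r): {64 / 32}")
```

Under these assumptions the adapters amount to a few percent of the 0.6B base parameters, which is why this setup fits on modest hardware.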
6. Limitations
- This is a base model without instruction tuning, so it will not follow instructions reliably.
- Complex multi-step reasoning may be limited with 0.6B parameters.
- Biases present in the training data may be reflected in outputs.
- Performance drops significantly in languages other than Turkish.
- Human verification of outputs is recommended for critical applications.
7. Citation
```bibtex
@misc{neuroturk2026hyz01,
  author = {NeuroTürk},
  title  = {HYZ-01-0.6B: A Lightweight Turkish Base Model},
  year   = 2026,
}
```