README

License: apache-2.0

1. Introduction

HYZ-01-0.6B-Base is the base (pre-trained only) version of the HYZ-01 series developed by NeuroTürk. It is a raw language model that has undergone multi-stage Turkish continual pre-training (CPT) on top of a multilingual foundation, without any instruction tuning or alignment. It is intended for researchers and developers who wish to fine-tune the model for their own tasks.

The underlying multilingual foundation covers 119 languages; continual pre-training then focused on Turkish. The tokenizer has been extended with tokens for Turkish morphological structure and for advanced use cases. At roughly 0.6B parameters, this is the lightweight, open-source base model of the HYZ-01 series.

Note: This is the base pre-trained version. For the instruction-tuned version, see: HYZ-01-0.6B


2. Model Summary

Continual Pre-Training

  • Base model: 4-stage Turkish continual pre-training (CPT) applied on top of a multilingual foundation.
  • Training data: the stages draw on a general Turkish web corpus, curated domain data, Wikipedia, and high-quality filtered text.
  • Optimization: bfloat16 precision, FlashAttention-2, AdamW optimizer.

Tokenizer Extension

New special tokens were added to the tokenizer for two purposes:

  • Language-structure tokens: To represent Turkish morphological features more efficiently.
  • Task and structure tokens: To support structural use cases such as chain-of-thought, code blocks, section markers, and language labels.

The following 20 tokens have been added to the vocabulary and are reserved as infrastructure for future advanced capabilities:

Group             Tokens                                      Future Use
Brand             <|neuroturk|> <|hyz01|> <|tr|> <|en|>       Model identity and multilingual control
Chain-of-Thought  <|think|> <|/think|> <|step|> <|answer|>    Step-by-step reasoning (CoT)
Dialogue          <|system|> <|user|> <|assistant|> <|end|>   Multi-turn dialogue and role management
Code              <|code|> <|/code|> <|output|> <|error|>     Structured code generation and debugging
Structure         <|title|> <|section|> <|list|> <|note|>     Long-form and structured text generation (reports, articles, etc.)
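
As a quick sanity check, the token list above can be compared against a loaded tokenizer's vocabulary. The sketch below is illustrative only: `ADDED_TOKENS` and `missing_tokens` are hypothetical names, not part of the released code, and the check assumes the tokenizer exposes the standard Hugging Face `get_vocab()` method.

```python
# Tokens copied from the table above, grouped as listed there.
ADDED_TOKENS = {
    "Brand": ["<|neuroturk|>", "<|hyz01|>", "<|tr|>", "<|en|>"],
    "Chain-of-Thought": ["<|think|>", "<|/think|>", "<|step|>", "<|answer|>"],
    "Dialogue": ["<|system|>", "<|user|>", "<|assistant|>", "<|end|>"],
    "Code": ["<|code|>", "<|/code|>", "<|output|>", "<|error|>"],
    "Structure": ["<|title|>", "<|section|>", "<|list|>", "<|note|>"],
}


def missing_tokens(vocab):
    """Return the added tokens that are absent from a token -> id mapping.

    `vocab` is expected to look like the dict returned by a Hugging Face
    tokenizer's get_vocab() method. Hypothetical helper for illustration.
    """
    return [
        token
        for group in ADDED_TOKENS.values()
        for token in group
        if token not in vocab
    ]


# The table lists 5 groups of 4 tokens, 20 in total.
assert sum(len(group) for group in ADDED_TOKENS.values()) == 20

# With the real tokenizer (requires downloading the model):
#   from transformers import AutoTokenizer
#   tok = AutoTokenizer.from_pretrained("neuroturk/HYZ-01-0.6B-Base")
#   print(missing_tokens(tok.get_vocab()))  # an empty list means all are present
```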

3. Model Details

Feature                    Value
Total parameters           595,798,016 (~0.6B)
Non-embedding parameters   440,467,456 (~0.44B)
Hidden dimension           1,024
Number of layers           28
Attention heads (Q)        16
Attention heads (KV)       8 (GQA)
Head dimension             128
Activation                 SiLU
Normalization              RMSNorm (ε = 1 × 10⁻⁶)
Positional encoding        RoPE (θ = 1,000,000)
Vocabulary size            151,690
Training context length    4,096 tokens
Theoretical max context    32,768 tokens
Precision                  BFloat16
VRAM usage (fp16)          ~1.11 GB
Disk size                  ~1.11 GB
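
The figures in the Model Details table can be cross-checked with a little arithmetic. The sketch below uses only numbers from the table. The KV-cache formula (keys plus values, per layer, per KV head) is a standard estimate for a single sequence rather than something stated by the authors, and the reading of the result as shared input/output embeddings is an inference.

```python
# Figures copied from the Model Details table.
TOTAL_PARAMS = 595_798_016
NON_EMBEDDING_PARAMS = 440_467_456
HIDDEN_DIM = 1_024
VOCAB_SIZE = 151_690
NUM_LAYERS = 28
NUM_KV_HEADS = 8
HEAD_DIM = 128
BYTES_PER_VALUE = 2  # bfloat16 and fp16 both use 2 bytes per value

GIB = 1024 ** 3

# Embedding parameters implied by the two parameter counts.
embedding_params = TOTAL_PARAMS - NON_EMBEDDING_PARAMS
# A single vocab_size x hidden_dim matrix accounts for all of them, which
# suggests (an inference, not a stated fact) shared input/output embeddings.
assert embedding_params == VOCAB_SIZE * HIDDEN_DIM

# Weight memory at 2 bytes per parameter.
weight_bytes = TOTAL_PARAMS * BYTES_PER_VALUE
print(f"weights: {weight_bytes / 1e9:.2f} GB = {weight_bytes / GIB:.2f} GiB")


def kv_cache_bytes(num_tokens):
    """Estimate KV-cache size for one sequence (keys + values, all layers)."""
    per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * BYTES_PER_VALUE
    return per_token * num_tokens


for context in (4_096, 32_768):
    print(f"KV cache at {context} tokens: {kv_cache_bytes(context) / GIB:.2f} GiB")
```

At 2 bytes per parameter the weights come to about 1.19 GB in decimal units, or about 1.11 GiB, so the table's ~1.11 figure is consistent with binary gigabytes. Actual VRAM use will be higher once activations and the KV cache are included.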

4. Training Details

Setting            Value
Training type      Continual Pre-Training (CPT)
Number of stages   4
Optimization       AdamW
Precision          BFloat16
LR schedule        Cosine with warmup
Context length     4,096 tokens
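
For readers unfamiliar with the "cosine with warmup" schedule named above, here is a minimal sketch of its usual form. The model card does not give the peak learning rate, warmup length, total steps, or minimum rate, so every value below is hypothetical, and linear warmup from zero is assumed.

```python
import math


def cosine_with_warmup(step, total_steps, warmup_steps, peak_lr, min_lr=0.0):
    """Learning rate at `step`: linear warmup, then cosine decay to `min_lr`."""
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr over the warmup phase.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, clamped to [0, 1].
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Hypothetical settings, chosen only to show the shape of the curve.
for step in (0, 500, 1_000, 5_500, 10_000):
    lr = cosine_with_warmup(step, total_steps=10_000, warmup_steps=1_000, peak_lr=1e-4)
    print(f"step {step:>6}: lr = {lr:.2e}")
```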

5. Usage

Warning: This is a base model. It is not instruction-tuned and will not follow instructions reliably. For conversational or task-oriented applications, use the instruction-tuned version, HYZ-01-0.6B.

Installation

bash

pip install transformers torch accelerate

Text Generation (Completion)

python

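from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "neuroturk/HYZ-01-0.6B-Base"

# Load the tokenizer with the settings given by the model authors.
tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    trust_remote_code=True,
    fix_mistral_regex=True,
)

# Load the weights in bfloat16 and place them on the available device(s).
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# A base model continues text, so the prompt is an unfinished Turkish sentence
# (roughly: "Artificial intelligence is computer systems' ...").
prompt = "Yapay zeka, bilgisayar sistemlerinin"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a continuation of up to 200 new tokens.
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    temperature=0.8,
    top_p=0.95,
    do_sample=True,
    repetition_penalty=1.1,
)

# Drop the prompt tokens and decode only the generated continuation.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))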
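
Because a base model only continues text, tasks are usually framed as completions rather than instructions. One common way to do that is few-shot prompting, sketched below. The helper `build_few_shot_prompt`, the labels, and the example sentences are all illustrative assumptions; how well this model responds to such prompts has not been verified here.

```python
def build_few_shot_prompt(examples, query, input_label="Yorum", output_label="Duygu"):
    """Format (input, output) pairs plus a final query as a completion prompt.

    The labels default to Turkish ("Yorum" = review, "Duygu" = sentiment).
    All names here are illustrative, not part of the model's API.
    """
    blocks = [
        f"{input_label}: {text}\n{output_label}: {label}"
        for text, label in examples
    ]
    # Leave the final output slot open so the model completes it.
    blocks.append(f"{input_label}: {query}\n{output_label}:")
    return "\n\n".join(blocks)


examples = [
    ("Bu film harikaydı.", "olumlu"),   # "This film was great." -> positive
    ("Ürün bozuk geldi.", "olumsuz"),   # "The product arrived broken." -> negative
]
# Query: "Shipping was very fast."
prompt = build_few_shot_prompt(examples, "Kargo çok hızlıydı.")
print(prompt)
```

The resulting string can be passed to the tokenizer and `model.generate` in place of the plain prompt, typically with a small `max_new_tokens` and with the output cut at the first newline.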

Low VRAM (4-bit Quantization)

python

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

# 4-bit NF4 quantization with double quantization; compute runs in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)

tokenizer = AutoTokenizer.from_pretrained(
    "neuroturk/HYZ-01-0.6B-Base",
    trust_remote_code=True,
)

# Weights are quantized while loading (requires the bitsandbytes package).
model = AutoModelForCausalLM.from_pretrained(
    "neuroturk/HYZ-01-0.6B-Base",
    quantization_config=bnb_config,
    device_map="auto",
)
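
To get a feel for how much memory 4-bit loading might save, the following back-of-the-envelope estimate uses the parameter counts from the Model Details section. It rests on assumptions that are not stated in the model card: only the non-embedding weights are stored at about 4 bits each, the embedding matrix stays at 2 bytes per value, and quantization constants and runtime overhead are ignored. The function name is hypothetical.

```python
# Parameter counts copied from the Model Details section.
TOTAL_PARAMS = 595_798_016
NON_EMBEDDING_PARAMS = 440_467_456
EMBEDDING_PARAMS = TOTAL_PARAMS - NON_EMBEDDING_PARAMS

GIB = 1024 ** 3


def estimate_4bit_weight_bytes(quantized_bits=4.0, unquantized_bytes=2):
    """Rough weight footprint when only non-embedding weights are quantized.

    Assumptions (not from the model card): every non-embedding parameter is
    stored at `quantized_bits` bits, the embedding matrix stays at
    `unquantized_bytes` bytes per value, and quantization constants and
    runtime overhead are ignored.
    """
    quantized = NON_EMBEDDING_PARAMS * quantized_bits / 8
    unquantized = EMBEDDING_PARAMS * unquantized_bytes
    return quantized + unquantized


full = TOTAL_PARAMS * 2  # 16-bit baseline
reduced = estimate_4bit_weight_bytes()
print(f"16-bit weights: {full / GIB:.2f} GiB")
print(f"4-bit estimate: {reduced / GIB:.2f} GiB")
```

Under these assumptions the saving is smaller than the 4x one might expect, because the embedding matrix is a large share of a 0.6B model. For a measured value, `model.get_memory_footprint()` on the loaded model reports the actual size in bytes.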

Fine-Tuning with Unsloth

python

from unsloth import FastLanguageModel

# Load the base model in 4-bit with the 4,096-token training context length.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="neuroturk/HYZ-01-0.6B-Base",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach LoRA adapters to all attention and MLP projection layers.
model = FastLanguageModel.get_peft_model(
    model,
    r=32,               # LoRA rank
    lora_alpha=64,      # scaling factor (alpha / r = 2)
    lora_dropout=0.0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    use_gradient_checkpointing="unsloth",
)
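
The number of trainable parameters this LoRA configuration adds can be estimated from the layer shapes. The sketch below takes the hidden size, head counts, head dimension, and layer count from the Model Details section. The MLP intermediate size is not given in the model card, so the value used here (3,072) is an assumption, and the helper names are hypothetical. It also assumes the usual layout in which `q_proj` maps the hidden size to heads × head dimension and `o_proj` maps it back.

```python
def lora_params(in_features, out_features, r):
    """Parameters in one LoRA adapter: A is (r x in), B is (out x r)."""
    return r * (in_features + out_features)


def lora_params_per_layer(hidden, q_heads, kv_heads, head_dim, intermediate, r):
    """Sum adapter sizes over the seven target modules of one decoder layer."""
    q_out = q_heads * head_dim    # query projection width
    kv_out = kv_heads * head_dim  # key/value projection width (GQA)
    shapes = {
        "q_proj": (hidden, q_out),
        "k_proj": (hidden, kv_out),
        "v_proj": (hidden, kv_out),
        "o_proj": (q_out, hidden),
        "gate_proj": (hidden, intermediate),
        "up_proj": (hidden, intermediate),
        "down_proj": (intermediate, hidden),
    }
    return sum(lora_params(i, o, r) for i, o in shapes.values())


ASSUMED_INTERMEDIATE = 3_072  # assumption: not stated in the model card

per_layer = lora_params_per_layer(
    hidden=1_024, q_heads=16, kv_heads=8, head_dim=128,
    intermediate=ASSUMED_INTERMEDIATE, r=32,
)
total = 28 * per_layer  # 28 decoder layers
print(f"LoRA parameters: {total:,} ({total / 595_798_016:.1%} of the base model)")
```

After calling `get_peft_model`, the count reported by the library itself is the authoritative figure; this estimate is only a way to reason about how rank and target modules affect adapter size.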

6. Limitations

  • This is a base model without instruction tuning, so it will not follow instructions reliably.
  • With only 0.6B parameters, its capacity for complex multi-step reasoning is likely to be limited.
  • Biases present in the training data may be reflected in outputs.
  • Performance drops significantly in languages other than Turkish.
  • Human verification of outputs is recommended for critical applications.

7. Citation

bibtex

@misc{neuroturk2026hyz01,
  author = {NeuroTürk},
  title  = {HYZ-01-0.6B: A Lightweight Turkish Base Model},
  year   = 2026,
}
