qwen3-4b-think-s1-ep23-full-sft API & Inference Endpoint

Model description

Method: full SFT (all weights trainable), DeepSpeed ZeRO-3, 4 GPUs
Dataset: ocr_think_50k
Template: qwen3
Not LoRA / not QLoRA: entire 4B model was updated

Training details

Table with columns: Field, Value
Field	Value
Epochs	2
Seed	42
cutoff_len	24576
packing	true
neat_packing	false
per_device_train_batch_size	1
gradient_accumulation_steps	16
effective_batch_size	64
learning_rate	5e-5
train_loss	0.5416
train_steps	604
finished_at	2026-06-10 05:23 CST

Optimizer: AdamW (fused), cosine schedule, warmup ratio 0.1. Framework: Transformers 5.6.0, PyTorch 2.8.0+cu128.

No-think SFT (same project): modrill/qwen3-4b-nothink-s1-full-sft - OpenCodeInstruct, qwen3_nothink template
Base: Qwen/Qwen3-4B-Base

Usage

python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "modrill/qwen3-4b-think-s1-ep23-full-sft"
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)

License

Released under Apache 2.0 (see LICENSE in the upstream Qwen model card if not bundled here).

Model description

Method: full SFT (all weights trainable), DeepSpeed ZeRO-3, 4 GPUs
Dataset: ocr_think_50k
Template: qwen3
Not LoRA / not QLoRA: entire 4B model was updated

Training details

Table with columns: Field, Value
Field	Value
Epochs	2
Seed	42
cutoff_len	24576
packing	true
neat_packing	false
per_device_train_batch_size	1
gradient_accumulation_steps	16
effective_batch_size	64
learning_rate	5e-5
train_loss	0.5416
train_steps	604
finished_at	2026-06-10 05:23 CST

Optimizer: AdamW (fused), cosine schedule, warmup ratio 0.1. Framework: Transformers 5.6.0, PyTorch 2.8.0+cu128.

No-think SFT (same project): modrill/qwen3-4b-nothink-s1-full-sft - OpenCodeInstruct, qwen3_nothink template
Base: Qwen/Qwen3-4B-Base

Usage

python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "modrill/qwen3-4b-think-s1-ep23-full-sft"
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)

License

Released under Apache 2.0 (see LICENSE in the upstream Qwen model card if not bundled here).

qwen3-4b-think-s1-ep23-full-sft

README

Model description

Training details

Usage

License

Explore FriendliAI today

README

Model description

Training details

Usage

License

qwen3-4b-think-s1-ep23-full-sft

README

Model description

Training details

Related models

Usage

License

Explore FriendliAI today

README

Model description

Training details

Related models

Usage

License