modrill

qwen3-4b-think-s1-ep23-full-sft

README

License: apache-2.0

Model description

  • Method: full SFT (all weights trainable), DeepSpeed ZeRO-3, 4 GPUs
  • Dataset: ocr_think_50k
  • Template: qwen3
  • Not LoRA / not QLoRA: all weights of the 4B-parameter model were updated
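For readers unfamiliar with ZeRO-3, the sketch below shows roughly what a DeepSpeed configuration for full-parameter fine-tuning can look like when driven through the Hugging Face Trainer integration. It is not the configuration used for this model, which the card does not publish. The `"auto"` values are placeholders that the Trainer fills in from its own arguments, and the choice of bf16 and the individual ZeRO flags are assumptions.

```python
import json

# Hypothetical ZeRO-3 config; NOT the file used to train this model.
# "auto" lets the Hugging Face Trainer fill values from its own arguments.
zero3_config = {
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "gradient_clipping": "auto",
    "bf16": {"enabled": "auto"},  # assumption: precision is not stated in the card
    "zero_optimization": {
        "stage": 3,  # shard parameters, gradients, and optimizer state across GPUs
        "overlap_comm": True,
        "contiguous_gradients": True,
        # Gather the sharded weights into a full checkpoint when saving.
        "stage3_gather_16bit_weights_on_model_save": True,
    },
}

# Serialize to the JSON form that DeepSpeed reads from disk.
zero3_json = json.dumps(zero3_config, indent=2)
print(zero3_json)
```

Stage 3 is what makes full SFT of a 4B model on 4 GPUs practical, because no single device has to hold the complete optimizer state.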

Training details

| Field | Value |
| --- | --- |
| Epochs | 2 |
| Seed | 42 |
| cutoff_len | 24576 |
| packing | true |
| neat_packing | false |
| per_device_train_batch_size | 1 |
| gradient_accumulation_steps | 16 |
| effective_batch_size | 64 |
| learning_rate | 5e-5 |
| train_loss | 0.5416 |
| train_steps | 604 |
| finished_at | 2026-06-10 05:23 CST |
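The effective batch size follows from the other fields and the GPU count given in the model description. A small check, using only values stated in this card (the variable names are illustrative):

```python
# Values copied from the training table; num_gpus comes from the model description.
per_device_train_batch_size = 1
gradient_accumulation_steps = 16
num_gpus = 4
train_steps = 604

# Sequences consumed per optimizer step across all devices.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_gpus
)

# Total packed sequences seen over the whole run (all epochs combined).
sequences_seen = train_steps * effective_batch_size

print(effective_batch_size, sequences_seen)
```

Because packing is enabled, each of these sequences is a packed window of up to `cutoff_len` tokens and may contain several original samples.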

Optimizer: AdamW (fused), cosine schedule, warmup ratio 0.1. Framework: Transformers 5.6.0, PyTorch 2.8.0+cu128.
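As an illustration of the schedule described above, the sketch below computes the learning rate at a given optimizer step using linear warmup followed by cosine decay to zero. It mirrors the usual shape of a cosine-with-warmup schedule and is not the training code itself. Rounding the warmup length up to a whole step is an assumption.

```python
import math

PEAK_LR = 5e-5
TOTAL_STEPS = 604
# Assumption: warmup length is the ratio times total steps, rounded up.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * 0.1)


def lr_at(step, peak_lr=PEAK_LR, total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate at an optimizer step: linear warmup, then cosine decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine from peak_lr down to 0.
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for s in (0, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(s, lr_at(s))
```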

Usage

python

from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "modrill/qwen3-4b-think-s1-ep23-full-sft"

# Load the tokenizer and the fully fine-tuned weights from the Hub.
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # place weights on the available device(s)
    trust_remote_code=True,
)
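Since the model was trained with the qwen3 template on a "think" dataset, its output will probably contain a reasoning span wrapped in `<think>` and `</think>` before the final answer. That is an assumption based on the Qwen3 convention, not something this card states. The helper below, whose name `split_thinking` is hypothetical, separates the two parts of a decoded string.

```python
def split_thinking(text, open_tag="<think>", close_tag="</think>"):
    """Split decoded model output into (reasoning, answer).

    Assumes the Qwen3-style convention of a single <think>...</think> span
    preceding the answer; returns an empty reasoning string if none is found.
    """
    head, found_close, tail = text.partition(close_tag)
    if not found_close:
        # No closing tag: treat the whole text as the answer.
        return "", text.strip()
    # The opening tag may be missing if the chat template already emitted it
    # as part of the prompt, so fall back to everything before the close tag.
    _, found_open, after_open = head.partition(open_tag)
    reasoning = after_open if found_open else head
    return reasoning.strip(), tail.strip()


sample = "<think>\nThe receipt lists two items.\n</think>\n\nTotal: 42"
reasoning, answer = split_thinking(sample)
print(reasoning)
print(answer)
```

In practice the input would be the string obtained by decoding the newly generated tokens from `model.generate(...)` with the tokenizer loaded above.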

License

Released under Apache 2.0 (see LICENSE in the upstream Qwen model card if not bundled here).

Model provider

modrill

Model tree

Base: Qwen/Qwen3-4B-Base
Fine-tuned: this model

Modalities

Input: Text
Output: Text
