modrill

qwen3-4b-nothink-s1-full-sft

Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more
Container

Run this model inference with full control and performance in your environment.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

README

License: apache-2.0

Model description

  • Method: full SFT (all weights trainable), DeepSpeed ZeRO-2, 4 GPUs
  • Dataset: oci_nothink_50k (50,000 examples)
  • Template: qwen3_nothink
  • Not LoRA / not QLoRA: entire 4B model was updated

Note: The final training save was interrupted by disk full; published weights were restored from checkpoint-782 (same step count as training completion).

Training details

Table
FieldValue
Epochs1
Seed42
cutoff_len4096
packingfalse
per_device_train_batch_size4
gradient_accumulation_steps4
effective_batch_size64 (4 x 4 x 4 GPUs)
learning_rate3e-5
train_loss0.1572
train_steps782
finished_at2026-06-10 06:18 CST
runtime~53 min

Usage

python

from transformers import AutoModelForCausalLM, AutoTokenizer
repo_id = "modrill/qwen3-4b-nothink-s1-full-sft"
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
repo_id,
torch_dtype="auto",
device_map="auto",
trust_remote_code=True,
)

License

Released under Apache 2.0 (see upstream Qwen model card if not bundled here).

Model provider

modrill

Model tree

Base

Qwen/Qwen3-4B-Base

Fine-tuned

this model

Modalities

Input

Text

Output

Text

Pricing

Dedicated Endpoints

View details

Supported Functionality

Model APIs

Dedicated Endpoints

Container

More information

Explore FriendliAI today