Erland

mini-qwen3-1m

Architecture

| Parameter | Value |
| --- | --- |
| Parameters | 1,217,608 |
| Model type | `qwen3` |
| Hidden size | 8 |
| Layers | 2 |
| Intermediate size | 32 |
| Attention heads | 1 |
| KV heads | 1 |
| Head dimension | 8 |
| Max position embeddings | 4,096 |
| Vocab size | 151,936 |
| Tensor dtype | `bfloat16` |
| Tokenizer source | `Qwen/Qwen3-0.6B` local mirror |
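
The parameter total can be cross-checked against the other rows of the table. The following is a minimal sketch, assuming the standard Qwen3 decoder layout (bias-free attention and MLP projections, RMSNorm on queries and keys over the head dimension, two RMSNorm layers per block, one final norm) and tied input/output embeddings. The function name `qwen3_param_count` is hypothetical and is not part of the repository.

```python
def qwen3_param_count(vocab, hidden, layers, intermediate,
                      heads, kv_heads, head_dim, tied=True):
    """Count parameters of a Qwen3-style decoder (assumed layout, no biases)."""
    embed = vocab * hidden                       # token embedding matrix
    lm_head = 0 if tied else vocab * hidden      # shared with embed when tied

    attn = hidden * heads * head_dim             # q_proj
    attn += 2 * hidden * kv_heads * head_dim     # k_proj and v_proj
    attn += heads * head_dim * hidden            # o_proj
    qk_norm = 2 * head_dim                       # q_norm and k_norm weights
    mlp = 3 * hidden * intermediate              # gate_proj, up_proj, down_proj
    block_norms = 2 * hidden                     # input and post-attention norms

    per_layer = attn + qk_norm + mlp + block_norms
    final_norm = hidden
    return embed + lm_head + layers * per_layer + final_norm


# Values copied from the Architecture table.
total = qwen3_param_count(vocab=151_936, hidden=8, layers=2, intermediate=32,
                          heads=1, kv_heads=1, head_dim=8)
print(f"{total:,}")
```

Under these assumptions the count matches the table. The embedding matrix accounts for 1,215,488 of the parameters, so the two decoder layers and the final norm contribute only 2,120.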

How this model was created

scripts/tools/create_mock_qwen3.py in the Relax ROCm Megatron workspace:

Loads the tokenizer and config metadata from a local Qwen3-0.6B checkpoint.
Shrinks the Qwen3 config dimensions to the table above.
Initializes Qwen3ForCausalLM with random weights.
Ties word embeddings and saves the model as safetensors.
Writes mock_qwen3_info.json with the exact generation metadata.

The model weights are random. Only tokenizer/chat-template metadata is copied from Qwen3-0.6B.
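
The config-shrinking step can be illustrated without any model libraries. This is a minimal sketch, not the contents of `create_mock_qwen3.py`: the helper `shrink_config` and the `MOCK_OVERRIDES` mapping are hypothetical, the key names are assumed to follow the Hugging Face Qwen3 `config.json` schema, and the `base` dictionary is an illustrative stand-in for the real Qwen3-0.6B config.

```python
import json

# Dimensions from the Architecture table, keyed by the (assumed)
# Hugging Face Qwen3 config.json field names.
MOCK_OVERRIDES = {
    "hidden_size": 8,
    "num_hidden_layers": 2,
    "intermediate_size": 32,
    "num_attention_heads": 1,
    "num_key_value_heads": 1,
    "head_dim": 8,
    "max_position_embeddings": 4096,
    "tie_word_embeddings": True,
    "torch_dtype": "bfloat16",
}


def shrink_config(base_config: dict) -> dict:
    """Return a copy of base_config with the mock dimensions applied."""
    shrunk = dict(base_config)      # leave the caller's dict untouched
    shrunk.update(MOCK_OVERRIDES)   # vocab_size and model_type are kept
    return shrunk


# Illustrative stand-in for the source checkpoint's config (assumption).
base = {"model_type": "qwen3", "vocab_size": 151936,
        "hidden_size": 1024, "num_hidden_layers": 28}
mock = shrink_config(base)
print(json.dumps(mock, indent=2, sort_keys=True))
```

The vocabulary size is deliberately left unchanged so that the copied tokenizer stays compatible with the embedding matrix.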

Reproduction

From the Relax ROCm Megatron repository:

```bash
source /vast/users/qirong.ho/miniforge3/etc/profile.d/conda.sh
conda activate relaxrl_rocm
python scripts/tools/create_mock_qwen3.py \
  --tokenizer-source /vast/users/qirong.ho/erland/Python_project/relax_e2e_assets/Qwen3-0.6B \
  --output-dir /vast/users/qirong.ho/erland/Python_project/relax_e2e_assets/Qwen3-Mock-1M
```

Relax e2e validation

This checkpoint was validated with the Relax AMD ROCm e2e launcher:

```bash
NUM_ROLLOUT=2 SAVE_INTERVAL=1 CKPT_FORMAT=torch_dist NO_SAVE_OPTIM=0 \
WANDB_GROUP="qwen3-mock-1m-tmux-20260531_095214" \
./amd_qwen3_mock_2gpu_e2e.sh
```

Validation evidence:

Ray job: raysubmit_sGx5uTXcKu41nHzL
W&B run: me4ticfh
Log message: Actor training completed step 0/2
Log message: Actor training completed step 1/2
Saved torch_dist checkpoints at iterations 0 and 1
Checkpoint metadata contains optimizer state keys, including optimizer.state.exp_avg and optimizer.state.exp_avg_sq
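
Checks like the ones listed above can be automated. The sketch below is an assumption about how one might do so, not part of the Relax launcher: `completed_steps` and `missing_optimizer_keys` are hypothetical helpers that operate on log text and on a list of checkpoint metadata key names that the caller has already extracted.

```python
import re

# Log message format as quoted in the validation evidence.
STEP_PATTERN = re.compile(r"Actor training completed step (\d+)/(\d+)")

# Optimizer state entries named in the validation evidence.
REQUIRED_OPTIMIZER_KEYS = ("optimizer.state.exp_avg",
                           "optimizer.state.exp_avg_sq")


def completed_steps(log_text: str) -> set[int]:
    """Return the step indices reported as completed in the log."""
    return {int(m.group(1)) for m in STEP_PATTERN.finditer(log_text)}


def missing_optimizer_keys(metadata_keys) -> list[str]:
    """Return required optimizer keys with no exact or dotted-prefix match."""
    missing = []
    for required in REQUIRED_OPTIMIZER_KEYS:
        # Match "exp_avg" itself or "exp_avg.<param>", but not "exp_avg_sq".
        found = any(key == required or key.startswith(required + ".")
                    for key in metadata_keys)
        if not found:
            missing.append(required)
    return missing


log = ("Actor training completed step 0/2\n"
       "Actor training completed step 1/2\n")
print(sorted(completed_steps(log)))
```

The prefix check appends a dot before comparing so that a key under `optimizer.state.exp_avg_sq` is not mistaken for `optimizer.state.exp_avg`.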

The e2e validation exercised:

Hugging Face model load
SGLang transformers rollout
Megatron Qwen3Bridge import
distributed weight update
optimizer step
W&B application metrics
optimizer-inclusive torch_dist checkpoint save

Intended use

Fast Relax/Megatron/SGLang startup and integration tests
ROCm smoke tests where Qwen3 code paths matter more than model quality
Checkpointing and resume infrastructure checks
Debugging model-provider, tokenizer, rollout, and weight-sync wiring

Not intended for

Inference quality evaluation
Benchmarking Qwen3 capability
Any downstream task
Reward/loss quality analysis

Because the model is random and extremely small, generated text is expected to be nonsense. During the validation run, rewards were invalid/negative and advantages collapsed to zero; this is expected for this smoke checkpoint.
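
One way advantages can collapse to zero is when every sample in a group receives the same reward. The sketch below assumes a group-normalized advantage estimator (reward minus group mean, divided by group standard deviation). The source does not state which estimator the validation run used, so `group_advantages` and the reward value `-1.0` are illustrative only.

```python
from statistics import fmean, pstdev


def group_advantages(rewards, eps=1e-6):
    """Group-normalized advantages: (r - mean) / (std + eps)."""
    mean = fmean(rewards)
    std = pstdev(rewards)   # population standard deviation of the group
    return [(r - mean) / (std + eps) for r in rewards]


# Every rollout gets the same (hypothetical) penalty reward.
advantages = group_advantages([-1.0, -1.0, -1.0, -1.0])
print(advantages)
```

With identical rewards each deviation from the mean is zero, so every advantage is zero and the policy gradient carries no signal. For a smoke test this is harmless because the goal is only to exercise the training path.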
