zbeeb

Qwen2.5-0.5B-Instruct-Math-SFT-100K-2ep

README

License: apache-2.0

Fine-tuned from Qwen/Qwen2.5-0.5B-Instruct on a 100K math reasoning SFT mixture for 2 epochs with learning rate 1e-5.

Prompt format used during training:

text
System: Please reason step by step, and put your final answer within \boxed{}.
User: {problem}
Assistant: {solution}

Training mixture:

Table with columns: Source, Count
Source	Count
nvidia/OpenMathReasoning CoT	40,000
AI-MO/NuminaMath-1.5 filtered, no AMC/AIME source	25,000
meta-math/MetaMathQA	15,000
MATH train, especially levels 4-5	15,000
GSM8K train	5,000

Training summary:

Evaluation results are not included in this model card yet.

Available on FriendliAI

Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Container

Run this model inference with full control and performance in your environment.

Model Details

Model Provider

zbeeb

Model Tree

Base

Qwen/Qwen2.5-0.5B-Instruct

Fine-tuned

this model

Input Modalities

Text

Output Modalities