INFUSER-Qwen3-4B-base API & Inference Endpoint

Summary

Paper: INFUSER: Influence-Guided Self-Evolution Improves Reasoning
Blog: y-agent.github.io/infuser
Base model: Qwen/Qwen3-4B-Base
Method: INFUSER
Code repo: https://github.com/FFishy-git/INFUSER
Data repo: Siyuc/infuser-data

Evaluation

Released Checkpoint Scores

Category	Benchmark	Score
General	MMLU-Pro	60.68%
General	GPQA-Diamond	35.35%
General	SuperGPQA	33.90%
General	BBEH	12.57%
Math & physics	MATH500	77.90%
Math & physics	AIME2024

Comparison Summary

Category and overall means are computed over the same benchmark groups as the paper's main table. INFUSER avg is the average over three seeded INFUSER runs. R-Few (paper) and SPICE (paper) are values self-reported in their original papers; categories those papers do not report are shown as -.

Category	This model	INFUSER avg	Base	R-Zero	AZR	R-Few (paper)	SPICE (paper)
General reasoning	35.62%	35.43%	29.46%	32.18%	32.93%	34.33%	35.00%
Math & physics reasoning	25.93%	25.73%	21.34%
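
As a quick check on how a category mean relates to the per-benchmark scores, the sketch below averages the four General scores listed under Released Checkpoint Scores. It assumes an unweighted arithmetic mean over benchmarks, which this card does not state explicitly; the helper name `category_mean` is made up for illustration. Under that assumption the result is about 35.625, which agrees with the 35.62% reported for This model in the General reasoning row up to rounding.

```python
# Per-benchmark scores (percent) for the General category,
# copied from the Released Checkpoint Scores table.
general_scores = {
    "MMLU-Pro": 60.68,
    "GPQA-Diamond": 35.35,
    "SuperGPQA": 33.90,
    "BBEH": 12.57,
}


def category_mean(scores):
    """Unweighted arithmetic mean of a benchmark -> score mapping.

    Assumption: every benchmark in a category counts equally.
    """
    return sum(scores.values()) / len(scores)


general_mean = category_mean(general_scores)
print(f"General mean: {general_mean:.3f}%")
```

Only the General category is worked here because all four of its benchmark scores appear in the table.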

Usage

python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Siyuc/INFUSER-Qwen3-4B-base"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)
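
Loading the checkpoint is only the first step; a minimal sketch of running one greedy completion might look like the following. The prompt layout (`Question: ... Answer:`), the helper names `build_prompt` and `generate_answer`, and the `max_new_tokens` default are assumptions for illustration, not the prompt or decoding settings used in the paper's evaluation. Since this is a base model, the sketch uses a plain-text completion prompt rather than a chat template.

```python
def build_prompt(question: str) -> str:
    """Format a plain-text completion prompt (hypothetical layout)."""
    return f"Question: {question.strip()}\nAnswer:"


def generate_answer(question: str,
                    repo_id: str = "Siyuc/INFUSER-Qwen3-4B-base",
                    max_new_tokens: int = 256) -> str:
    """Load the checkpoint and return one greedy completion.

    Imports are kept inside the function so the prompt helper above
    can be used without transformers installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)

    # Use a GPU when one is available; otherwise stay on CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)
    model.eval()

    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,  # greedy decoding, an assumption for this sketch
        )

    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_answer("What is 12 * 13?")` downloads the weights on first use and loads them in the default precision, so memory use for a 4B-parameter model can be substantial; dtype and device placement choices are left out of the sketch on purpose.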

Citation

bibtex
@misc{chen2026infuser,
  title        = {INFUSER: Influence-Guided Self-Evolution Improves Reasoning},
  author       = {Siyu Chen and Miao Lu and Beining Wu and Heejune Sheen and Fengzhuo Zhang and Shuangning Li and Zhiyuan Li and Jose Blanchet and Tianhao Wang and Zhuoran Yang},
  year         = {2026},
  eprint       = {2606.09052},
  archivePrefix = {arXiv},
  primaryClass = {cs.LG},
  url          = {https://arxiv.org/abs/2606.09052}
}

