INFUSER-rlvr-Qwen3-8B-base

License: apache-2.0

Description

INFUSER is an iterative co-training framework that enables foundation model agents to autonomously discover and practice reasoning skills from unstructured documents. It co-evolves two roles:

Generator: Reads unstructured documents and drafts reasoning questions with reference answers.
Solver: Learns to answer these questions through reinforcement learning, guided by an optimizer-aware influence score that measures how much training on a question actually improves the solver's performance on its target distribution.
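
The selection step implied by the two roles above can be pictured with a toy example. This is a minimal sketch under stated assumptions, not the INFUSER implementation: the names `influence`, `sgd_update`, and `select_questions`, the two-dimensional toy gradients, and the first-order score itself are all hypothetical stand-ins for the optimizer-aware influence score defined in the paper. Here a question scores well if the parameter update it would induce, after passing through the optimizer's step rule, is predicted to lower the loss on the target distribution.

```python
from typing import Callable, Dict, List, Sequence


def sgd_update(grad: Sequence[float], lr: float = 0.1) -> List[float]:
    """Hypothetical step rule: plain SGD turns a gradient into a parameter update."""
    return [-lr * g for g in grad]


def influence(update: Sequence[float], target_grad: Sequence[float]) -> float:
    """First-order estimate of how much the target loss drops if `update` is applied."""
    return -sum(u * g for u, g in zip(update, target_grad))


def select_questions(
    question_grads: Dict[str, List[float]],
    target_grad: Sequence[float],
    k: int,
    make_update: Callable[[Sequence[float]], List[float]] = sgd_update,
) -> List[str]:
    """Keep at most k questions whose induced update is predicted to help the target."""
    scored = [
        (influence(make_update(grad), target_grad), name)
        for name, grad in question_grads.items()
    ]
    scored.sort(reverse=True)  # highest estimated influence first
    return [name for score, name in scored[:k] if score > 0]


# Toy gradients (made up for illustration) for three generated questions.
toy_question_grads = {
    "q_aligned": [1.0, 0.0],
    "q_orthogonal": [0.0, 1.0],
    "q_opposed": [-1.0, 0.0],
}
toy_target_grad = [2.0, 0.0]
selected = select_questions(toy_question_grads, toy_target_grad, k=2)
```

In this toy setting only `q_aligned` survives, because the other two questions have zero or negative estimated influence. Swapping `make_update` for a different step rule is where an optimizer-aware variant would differ from a plain gradient dot product.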

This specific checkpoint uses a hybrid training recipe combining science documents with Putnam/AIME-style RLVR math data.

Summary

Base model: Qwen/Qwen3-8B-Base
Method: Math-RLVR + INFUSER
Code repo: https://github.com/FFishy-git/INFUSER
Data repo: Siyuc/infuser-data

Evaluation

Released Checkpoint Scores

Category	Benchmark	Score
General	MMLU-Pro	65.80%
General	GPQA-Diamond	43.54%
General	SuperGPQA	36.43%
General	BBEH	13.51%
Math & physics	MATH500	84.85%
Math & physics	AIME2024

Comparison Summary

Category and overall means use the same benchmark groups as the paper.

Category	This model	Math-RLVR + INFUSER avg	Science-only INFUSER avg
General reasoning	39.82%	39.37%	40.62%
Math & physics reasoning	33.07%	32.52%	31.49%
Medical	38.89%	39.39%	40.52%
Coding	52.00%	52.49%	53.29%
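
The figures above can be cross-checked with a few lines of arithmetic. The sketch below assumes that a category mean is the unweighted average of its benchmark scores; under that assumption the four General scores listed under Released Checkpoint Scores reproduce the 39.82% General reasoning entry. It also computes the gap, in percentage points, between this checkpoint and each of the two averages. All numbers are copied from the tables in this card; the variable and function names are illustrative.

```python
from statistics import mean

# General benchmark scores from "Released Checkpoint Scores" (percent).
general_scores = {
    "MMLU-Pro": 65.80,
    "GPQA-Diamond": 43.54,
    "SuperGPQA": 36.43,
    "BBEH": 13.51,
}
# Assumption: the category mean is an unweighted average of its benchmarks.
general_mean = round(mean(list(general_scores.values())), 2)

# Rows from "Comparison Summary":
# (this model, Math-RLVR + INFUSER avg, Science-only INFUSER avg), in percent.
comparison = {
    "General reasoning": (39.82, 39.37, 40.62),
    "Math & physics reasoning": (33.07, 32.52, 31.49),
    "Medical": (38.89, 39.39, 40.52),
    "Coding": (52.00, 52.49, 53.29),
}


def gaps(row):
    """Return this model's gap to each average, in percentage points."""
    this_model, hybrid_avg, science_avg = row
    return round(this_model - hybrid_avg, 2), round(this_model - science_avg, 2)


deltas = {category: gaps(row) for category, row in comparison.items()}

for category, (vs_hybrid, vs_science) in deltas.items():
    print(f"{category}: {vs_hybrid:+.2f} vs hybrid avg, {vs_science:+.2f} vs science-only avg")
```

Read this way, the checkpoint sits above both averages only on math and physics reasoning, and below the science-only average in the other three categories.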

Usage

python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Siyuc/INFUSER-rlvr-Qwen3-8B-base"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)
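
The loading snippet above stops before any text is generated. A minimal sketch of a full round trip follows. The `Question:`/`Answer:` prompt format, the helper names `build_prompt` and `generate_answer`, and the generation settings are assumptions for illustration; this card does not specify a prompt template. The heavy imports sit inside the function so the prompt helper can be used without loading the libraries.

```python
def build_prompt(question: str) -> str:
    """Format a plain-text prompt (assumed format, not taken from the model card)."""
    return f"Question: {question.strip()}\nAnswer:"


def generate_answer(
    question: str,
    repo_id: str = "Siyuc/INFUSER-rlvr-Qwen3-8B-base",
    max_new_tokens: int = 256,
) -> str:
    """Load the checkpoint and greedily decode a continuation of the prompt."""
    # Imported here so that build_prompt works without torch/transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    # Default loading as in the snippet above; dtype and device options are left out.
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)
    model.eval()

    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )
    # Drop the prompt tokens and decode only the newly generated part.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
```

Calling `generate_answer("What is the derivative of x**2?")` downloads the full set of weights on first use, so it needs enough disk space and memory for an 8B-parameter model.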

Citation

If you find this model or method useful, please cite:

bibtex
@misc{chen2026infuser,
  title        = {INFUSER: Influence-Guided Self-Evolution Improves Reasoning},
  author       = {Siyu Chen and Miao Lu and Beining Wu and Heejune Sheen and Fengzhuo Zhang and Shuangning Li and Zhiyuan Li and Jose Blanchet and Tianhao Wang and Zhuoran Yang},
  year         = {2026},
  eprint       = {2606.09052},
  archivePrefix = {arXiv},
  primaryClass = {cs.LG},
  url          = {https://arxiv.org/abs/2606.09052}
}

Notes

The repository root is intentionally flattened so the tokenizer files, config files, and model shard files are available directly at the top level for standard transformers loading.
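
One way to confirm the flat layout is to list the repository's files and look for any path that contains a directory separator. The sketch below assumes the `huggingface_hub` package is installed and uses its `list_repo_files` function; the helper names `nested_files` and `check_repo_is_flat` are illustrative.

```python
from typing import Iterable, List


def nested_files(filenames: Iterable[str]) -> List[str]:
    """Return the paths that live inside a subdirectory rather than at the top level."""
    return [name for name in filenames if "/" in name]


def check_repo_is_flat(repo_id: str = "Siyuc/INFUSER-rlvr-Qwen3-8B-base") -> bool:
    """Fetch the file listing from the Hub and report whether every file is top-level."""
    # Imported here so nested_files can be used without network access or the package.
    from huggingface_hub import list_repo_files

    return not nested_files(list_repo_files(repo_id))
```

`check_repo_is_flat()` needs network access; `nested_files` can be run on any list of paths.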
