License: apache-2.0
Key Results
| Benchmark | LiteResearcher-4B | Notable Comparison |
|---|---|---|
| GAIA-Text | 71.3% | ≈ Claude-4.5-Sonnet (71.2%) |
| Xbench-DS | 78.0% | > Tongyi DeepSearch 30B (75.0%) |
| Frames | 83.1% | > Claude-4-Sonnet (80.7%) |
| WebWalkerQA | 72.7% | > Tongyi DeepSearch 30B (72.2%) |
All with only 4B parameters — 8–32× smaller than comparable models.
Model Details
- Architecture: Qwen3ForCausalLM (Qwen3-4B-Thinking base)
- Parameters: 4B
- Max Context: 262,144 tokens
- Training: Two-stage difficulty-aware curriculum RL with virtual world environment
- Agent Mode: ReAct-style with search and visit tools
How It Works
LiteResearcher operates as a ReAct agent that iteratively:
- Thinks about what information is needed
- Searches the web via Google
- Visits webpages to extract evidence
- Answers when sufficient information is gathered
The model uses <think>, <tool_call>, and <answer> tags to structure its reasoning.
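The loop above can be sketched roughly as follows. This is a minimal illustration, not the project's inference framework. The JSON payload inside `<tool_call>` (`{"name": ..., "arguments": {...}}`), the `<tool_response>` wrapper, and the `fake_model`, `search`, and `visit` stubs are all assumptions introduced for the example; the real formats may differ.

```python
import json
import re

# One non-greedy pattern per tag the model is documented to emit.
TAG_RE = {
    name: re.compile(rf"<{name}>(.*?)</{name}>", re.DOTALL)
    for name in ("think", "tool_call", "answer")
}


def parse_turn(text):
    """Return the first <think>, <tool_call>, and <answer> contents (None if absent)."""
    out = {}
    for name, pattern in TAG_RE.items():
        match = pattern.search(text)
        out[name] = match.group(1).strip() if match else None
    return out


# Hypothetical stand-ins for the real search and visit tools.
def search(query):
    return f"[stub results for: {query}]"


def visit(url):
    return f"[stub page content of: {url}]"


TOOLS = {"search": search, "visit": visit}


def run_agent(generate, question, max_steps=8):
    """Drive a ReAct loop: call the model, run any tool call, stop at <answer>."""
    transcript = [question]
    for _ in range(max_steps):
        turn = parse_turn(generate(transcript))
        if turn["answer"] is not None:
            return turn["answer"]
        if turn["tool_call"] is None:
            break  # neither a tool call nor an answer: give up
        call = json.loads(turn["tool_call"])  # assumed JSON payload
        result = TOOLS[call["name"]](**call["arguments"])
        # Assumed wrapper for feeding tool output back to the model.
        transcript.append(f"<tool_response>{result}</tool_response>")
    return None


def fake_model(transcript):
    """Scripted model used only to exercise the loop."""
    if len(transcript) == 1:
        return (
            "<think>I need to look this up.</think>"
            '<tool_call>{"name": "search", "arguments": {"query": "example"}}</tool_call>'
        )
    return "<think>Enough evidence.</think><answer>stub answer</answer>"


print(run_agent(fake_model, "example question"))
```

In a real setup, `generate` would wrap a call to the served model and the two tools would hit a search API and a page fetcher. The step cap prevents a run from looping forever when no answer is produced.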
Quick Start
With the Inference Framework
```bash
git clone https://github.com/simplex-ai-inc/LiteResearcher.git
cd LiteResearcher
pip install -r requirements.txt

# Configure API keys
cp .env.example .env
# Edit .env with your SERPER_KEY_ID and SCRAPEDO_API_KEY

# Start SGLang server
python -m sglang.launch_server \
  --model-path simplex-ai-inc/LiteResearcher-4B \
  --port 6001 --tp 2

# Run inference
bash scripts/run_all.sh \
  --model simplex-ai-inc/LiteResearcher-4B \
  --dataset data/example.jsonl
```
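Once the SGLang server is running, it can also be queried directly for a quick smoke test. The sketch below assumes the server exposes its OpenAI-compatible chat endpoint on the port used above and that the served model name equals the model path. The `build_request` and `ask` helpers are hypothetical names, and the sampling values are borrowed from the Transformers example in this README.

```python
import json
import urllib.request

# Assumed endpoint: SGLang's OpenAI-compatible route on the port from the launch command.
BASE_URL = "http://localhost:6001/v1/chat/completions"
MODEL = "simplex-ai-inc/LiteResearcher-4B"


def build_request(question, system_prompt="You are a deep research assistant..."):
    """Build (but do not send) a chat-completion request for one question."""
    payload = {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ],
        "temperature": 0.6,
        "top_p": 0.95,
        "max_tokens": 4096,
    }
    return urllib.request.Request(
        BASE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question):
    """Send the request and return the raw model text (tags included)."""
    with urllib.request.urlopen(build_request(question)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling `ask("Who won the Nobel Prize in Physics in 2024?")` returns a single raw generation. No tools are executed on this path, so any `<tool_call>` in the output goes unanswered; the `scripts/run_all.sh` pipeline is what runs the full agent loop.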
Direct Usage with Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "simplex-ai-inc/LiteResearcher-4B"

# Load the tokenizer and model; dtype and device placement are chosen automatically.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

messages = [
    {"role": "system", "content": "You are a deep research assistant..."},
    {"role": "user", "content": "Who won the Nobel Prize in Physics in 2024?"},
]

# Render the chat template to a prompt string, then tokenize it.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# do_sample=True is set explicitly so that temperature and top_p take effect.
outputs = model.generate(**inputs, max_new_tokens=4096, do_sample=True, temperature=0.6, top_p=0.95)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```
Training
LiteResearcher is trained with a three-component framework:
- Co-constructed Training Data & Corpus — 32M+ webpages, 1M+ domains, covering five atomic search capabilities (direct retrieval, aggregation, enumeration, cross-verification, statistics)
- Stable Local Tool Environment — Local search engine (BGE-M3 + Milvus) and local browse tool (PostgreSQL) enabling 73.2M tool calls during training at zero marginal cost
- Difficulty-Aware Curriculum RL — Multi-stage training that progressively increases task difficulty and context length
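As a toy illustration of the third component, the sketch below stages tasks by estimated difficulty. It is not the authors' training code: the pass-rate difficulty proxy, the thresholds, the stage names, and the context budgets are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Task:
    question: str
    pass_rate: float  # assumed difficulty proxy: share of rollouts the current policy solves


@dataclass
class Stage:
    name: str
    min_pass_rate: float  # inclusive lower bound
    max_pass_rate: float  # exclusive upper bound
    max_context_tokens: int


# Hypothetical schedule: later stages use harder tasks and a longer context budget.
STAGES = [
    Stage("stage-1", 0.4, 0.9, 16_384),
    Stage("stage-2", 0.1, 0.4, 65_536),
]


def select_tasks(tasks, stage):
    """Keep tasks whose estimated pass rate falls inside the stage's band."""
    return [t for t in tasks if stage.min_pass_rate <= t.pass_rate < stage.max_pass_rate]


tasks = [
    Task("trivial lookup", 0.95),
    Task("two-hop question", 0.60),
    Task("cross-verification question", 0.25),
    Task("currently unsolvable question", 0.02),
]

for stage in STAGES:
    chosen = [t.question for t in select_tasks(tasks, stage)]
    print(stage.name, stage.max_context_tokens, chosen)
```

Tasks the policy almost always solves, or almost never solves, fall outside both bands in this sketch, since they would contribute little learning signal under a pass-rate-based reward. Whether the actual framework filters this way is not stated in this README.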
Benchmark Results
LiteResearcher-4B posts the best open-source scores on GAIA, WebWalkerQA, MAIA, and Xbench-DS, ahead of open-source models up to 8× larger, and is competitive with proprietary systems on several of the eight benchmarks below.
| Model | Size | GAIA | BrowseComp (en) | BrowseComp (zh) | Humanity | Frames | WebWalkerQA | MAIA | Xbench-DS |
|---|---|---|---|---|---|---|---|---|---|
| Commercial Models | | | | | | | | | |
| Claude-4-Sonnet | - | 68.3 | 12.2 | 29.1 | 20.3 | 80.7 | 61.7 | - | 64.6 |
| Claude-4.5-Sonnet | - | 71.2 | 19.6 | 40.8 | 24.5 | 85.0 | - | 53.4 | 66.0 |
| DeepSeek-V3.2 | - | 63.5 | 67.6 | 65.0 | 40.8 | 80.2 | - | 38.5 | 71.0 |
| DeepSeek-V3.1 | - | 63.1 | 30.0 | 49.2 | 29.8 | 83.7 | 61.2 | - | 71.0 |
| Minimax-M2 | - | 75.7 | 44.0 | 48.5 | 31.8 | - | - | - | 72.0 |
| OpenAI-GPT-5-high | - | 76.4 | 54.9 | 65.0 | 35.2 | - | - | 51.4 | 77.8 |
| GLM-4.6 | - | 71.9 | 45.1 | 49.5 | 30.4 | - | - | - | 70.0 |
| Kimi-Researcher | - | - | - | - | 26.9 | 78.8 | - | 36.0 | 69.0 |
| Kimi-K2-0905 | - | 60.2 | 7.4 | 22.2 | 21.7 | 58.1 | - | 25.2 | 61.0 |
| Open-Source Models | | | | | | | | | |
| Mirothinker | 8B | 66.4 | 31.1 | 40.2 | 21.5 | 80.6 | 60.6 | 40.4 | 60.6 |
| Tongyi DeepSearch | 30B | 70.9 | 43.4 | 46.7 | 32.9 | 90.6 | 72.2 | - | 75.0 |
| ASearcher QWQ v2 | 32B | 58.7 | - | - | - | 74.5 | - | - | 51.1 |
| WebSailor | 30B | 53.2 | - | - | - | - | - | - | 53.3 |
| WebDancer (QwQ) | 32B | 51.5 | 3.8 | 18.0 | - | - | 47.9 | - | 38.3 |
| WebExplorer | 8B | 50.0 | 15.7 | 32.0 | 17.3 | 75.7 | 62.7 | - | 53.7 |
| DeepMiner | 32B | 58.7 | 33.5 | 40.1 | - | - | - | - | 62.0 |
| AFM-RL | 32B | 55.3 | 11.1 | - | 18.0 | - | 63.0 | - | - |
| SFR-DeepResearch | 20B | 66.0 | - | - | 28.7 | 82.8 | - | - | - |
| AgentCPM-Explore | 4B | 63.9 | 24.1 | 29.1 | 19.1 | 82.7 | 68.1 | 40.5 | 70.0 |
| LiteResearcher | 4B | 71.3 | 27.5* | 32.5* | 22.0 | 83.1 | 72.7 | 41.8 | 78.0 |
Best open-source results in bold. Results with * use a 64k context window with a memory mechanism.
Citation
```bibtex
@article{li2026literesearcher,
  title={LiteResearcher: A Scalable Agentic RL Training Framework for Deep Research Agent},
  author={Wanli Li and Bince Qu and Bo Pan and Jianyu Zhang and Zheng Liu and Pan Zhang and Wei Chen and Bo Zhang},
  journal={arXiv preprint arXiv:2604.17931},
  year={2026}
}
```
License
This model is released under the Apache 2.0 License.