Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

README

License: apache-2.0

⚠️ A tool-using agent, not a standalone chatbot

The model emits <tool_call> shell commands that must be executed against the Wikipedia corpus and fed back as <tool_response> turns. To use it you need: (1) the corpus PeterJinGo/wiki-18-corpus, (2) a tool-calling vLLM server, and (3) the GrepSeek inference harness (grep tool

Usage

bash

git clone https://github.com/alirezasalemi7/grepseek && cd grepseek
# env: TRAINING_ENV.md · corpus: cold_start_sft/download_corpus.py
# 1. serve this checkpoint
MODEL_PATH=alireza7/GrepSeek-Qwen3.5-9B-SFT bash rl/serve_rl.sh # -> http://localhost:10730/v1
# 2. run the agent (paper inference: temperature 0.6, <=6 turns, 16k context)
GREPSEEK_CORPUS_ROOT=/path/to/wiki_18_corpus \
bash inference/run_inference.sh --base_url http://localhost:10730/v1 \
--model grepseek --temperature 0.6 --input my_questions.jsonl --out_dir out

Evaluation (token-F1 / EM, micro-average over 7 QA benchmarks)

This SFT-only policy already substantially beats the untuned base model, but RL adds large gains on multi-hop reasoning:

variantmicro-avg F1micro-avg EM
base (no SFT, no RL)0.33140.2836
this model (SFT only)0.42490.3569
+ GRPO → GrepSeek-Qwen3.5-9B-GRPO0.56910.4948

(7 benchmarks: NQ, TriviaQA, PopQA, HotpotQA, 2WikiMultiHopQA, MuSiQue, Bamboogle; trained only on NQ + HotpotQA, the rest are out-of-distribution.)

License

Inherits the license of the base model Qwen/Qwen3.5-9B — confirm and update the license field above if needed.

Citation

bibtex

@misc{salemi2026grepseektrainingsearchagents,
title={GrepSeek: Training Search Agents for Direct Corpus Interaction},
author={Alireza Salemi and Chang Zeng and Atharva Nijasure and Jui-Hui Chung and Razieh Rahimi and Fernando Diaz and Hamed Zamani},
year={2026},
eprint={2605.29307},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2605.29307},
}

Model provider

alireza7

alireza7

Model tree

Base

Qwen/Qwen3.5-9B

Fine-tuned

this model

Modalities

Input

Video, Text, Image

Output

Text

Pricing

Dedicated Endpoints

View details

Supported Functionality

Model APIs

Dedicated Endpoints

Container

More information

Explore FriendliAI today