README

License: mit

Available Checkpoints

| Subdirectory | Base Model | Method | Steps | Key Results |
|---|---|---|---|---|
| sft-dr1-7b-final | DeepSeek-R1-Distill-Qwen-7B | SFT | 3651 | GSM8K 83.5% baseline |
| grpo-topoprm-dr1-7b | DeepSeek-R1-Distill-Qwen-7B | GRPO+TopoPRM | 100 | Hierarchical reward |
| grpo-topoprm-qwen35-9b | Qwen3.5-9B | GRPO+TopoPRM | 50 | GSM8K 93.5%, MATH500 49.8% |
| opd-topoprm-dr1-7b-v2 | DeepSeek-R1-Distill-Qwen-7B | OPD Stage3 | 200 | MATH500 60.8%, Omni-MATH 56.9% |
| opd-topoprm-qwen35-9b-v2 | Qwen3.5-9B | OPD Stage3 | 50 | Distillation |
| grpo-scae-qwen35-9b | Qwen3.5-9B | GRPO+SCAE | 949 | SCAE variant |
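
Because the checkpoints are built on two different base models, each adapter has to be paired with the base it was trained on. The lookup below is a minimal sketch transcribed from the table above. `CHECKPOINT_BASES` and `base_model_for` are illustrative names, not part of the repository, and the base names are copied as listed (this README does not give a full Hub repository id for Qwen3.5-9B).

```python
# Illustrative lookup: checkpoint subdirectory -> base model name as listed
# in the checkpoint table. The helper names here are hypothetical.
CHECKPOINT_BASES = {
    "sft-dr1-7b-final": "DeepSeek-R1-Distill-Qwen-7B",
    "grpo-topoprm-dr1-7b": "DeepSeek-R1-Distill-Qwen-7B",
    "grpo-topoprm-qwen35-9b": "Qwen3.5-9B",
    "opd-topoprm-dr1-7b-v2": "DeepSeek-R1-Distill-Qwen-7B",
    "opd-topoprm-qwen35-9b-v2": "Qwen3.5-9B",
    "grpo-scae-qwen35-9b": "Qwen3.5-9B",
}


def base_model_for(subfolder: str) -> str:
    """Return the base model name a checkpoint subdirectory was trained on."""
    try:
        return CHECKPOINT_BASES[subfolder]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINT_BASES))
        raise ValueError(
            f"Unknown checkpoint {subfolder!r}; expected one of: {known}"
        ) from None


print(base_model_for("grpo-topoprm-qwen35-9b"))
```

Failing loudly on an unknown subdirectory avoids silently attaching an adapter to the wrong base model.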

Common LoRA Config

All adapters share:

  • r: 64
  • alpha: 128
  • dropout: 0.05
  • target_modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • task_type: CAUSAL_LM
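
To make these numbers concrete, here is a small standard-library sketch of the shared settings. It assumes standard LoRA scaling (`lora_alpha / r`, not the rsLoRA variant), and the field names mirror the parameter names of `peft`'s `LoraConfig`. `SharedLoraSettings`, `added_params`, and the 4096x4096 layer shape are hypothetical and only there for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SharedLoraSettings:
    """Shared adapter hyperparameters as listed in this README."""
    r: int = 64
    lora_alpha: int = 128
    lora_dropout: float = 0.05
    target_modules: list = field(default_factory=lambda: [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ])
    task_type: str = "CAUSAL_LM"

    @property
    def scaling(self) -> float:
        # Assumption: standard LoRA scales the low-rank update by lora_alpha / r.
        return self.lora_alpha / self.r


def added_params(in_features: int, out_features: int, r: int) -> int:
    # LoRA adds A (r x in_features) and B (out_features x r) per adapted layer.
    return r * (in_features + out_features)


settings = SharedLoraSettings()
print(settings.scaling)                      # 2.0
print(added_params(4096, 4096, settings.r))  # 524288 (hypothetical layer shape)
```

With `alpha` set to twice `r`, the low-rank update is applied with a factor of 2 under this scaling convention.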

Usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model the adapter was trained on (see Available Checkpoints).
base_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"
base_model = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Attach the LoRA adapter stored in the chosen subdirectory of the checkpoint repository.
model = PeftModel.from_pretrained(base_model, "rwlinno/topoprm-ckpts", subfolder="grpo-topoprm-dr1-7b")
```

Citation

```bibtex
@inproceedings{topoprm2026,
  title={Topology-Aware Process Rewards for Verifiable Mathematical Reasoning},
  author={Weilin Ruan},
  booktitle={Proceedings of EMNLP 2026},
  year={2026}
}
```

Model provider: rwlinno

Model tree: base deepseek-ai/DeepSeek-R1-Distill-Qwen-7B; adapter: this model

Modalities: text input, text output
