
OpenRubrics/RubricARROW-8B-Rubric

This is an 8B rubric generation model, introduced in the paper RUBRIC-ARROW: Alternating Pointwise Rubric Reward Modeling for LLM Post-training in Non-verifiable Domains.

It is fine-tuned from Qwen/Qwen3-8B.

Usage

python

from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "OpenRubrics/RubricARROW-8B-Rubric"
tok = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

To evaluate the model, build the input message in the following format.

python

RUBRIC_PROMPT_TEMPLATE = (
"Your task is to extract a set of rubric-style instructions from a user's request.\n"
"These rubrics will be used as evaluation criteria to check if a response fully meets the request.\n"
"Every rubric item must be a universal principle. If any rubric still contains topic-specific references (e.g., names, places, myths, numbers, historical facts), it is automatically invalid.\n"
"\n"
"- **Two Distinct Categories:**\n"
" - [Hard Rule]: Derived strictly from explicit requirements stated in the <request> (format, length, structure, forbidden/required elements, etc.).\n"
" - [Principle]: Derived by abstracting any concrete cues into domain-agnostic quality criteria (e.g., clarity, correctness, sound reasoning, pedagogy).\n"
"\n"
"- **Comprehensiveness:**\n"
" The rubric must cover all critical aspects implied by the request and examples, including explicit requirements and implicit quality standards.\n"
"\n"
"- **Conciseness & Uniqueness:**\n"
" Each rubric must capture a distinct evaluation criterion. Overlapping or redundant criteria must be merged into a single rubric. Wording must be precise and free of repetition.\n"
"\n"
"- **Format Requirements:**\n"
" - Use a numbered list.\n"
" - Each item starts with \"The response\" phrased in third person.\n"
" - Append [Hard Rule] or [Principle] at the end of each item.\n"
" - Do not include reasoning, explanations, or examples in the final output—only the rubrics.\n"
"\n"
"Here is the request:\n"
"{prompt}\n"
"\n"
"Please generate the rubrics for the above request."
)
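# `instruction` is the user request (a string) to generate rubrics for.
user_text = RUBRIC_PROMPT_TEMPLATE.format(
    prompt=instruction,
)
messages_list = [
    {"role": "user", "content": user_text},
]
# Render the chat template to a prompt string; Qwen3 thinking mode is disabled.
message = tok.apply_chat_template(
    messages_list,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)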
# Remaining step: Use either HF or vLLM for evaluation.
# ...
# ...
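
The remaining generation step, and the output format requested by the template, could be handled along these lines. This is a minimal sketch using Hugging Face `generate`, not the authors' evaluation script: the helper names `generate_rubric_text` and `parse_rubrics`, the greedy decoding, and the `max_new_tokens=1024` budget are assumptions made for illustration. The parser relies only on the format the prompt asks for (a numbered list whose items end in [Hard Rule] or [Principle]).

```python
import re

# Matches lines such as "1. The response ... [Hard Rule]" (format requested by the prompt).
_ITEM_RE = re.compile(r"^\s*(\d+)[.)]\s*(.+?)\s*\[(Hard Rule|Principle)\]\s*$")


def generate_rubric_text(model, tok, message, max_new_tokens=1024):
    """Decode a completion for an already-templated prompt string.

    Assumption: greedy decoding with a 1024-token budget; the paper may use
    different generation settings.
    """
    # The chat template already inserted special tokens, so do not add them again.
    inputs = tok(message, return_tensors="pt", add_special_tokens=False).to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens so only the newly generated text is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tok.decode(new_tokens, skip_special_tokens=True)


def parse_rubrics(text):
    """Split generated text into rubric items; non-matching lines are skipped."""
    items = []
    for line in text.splitlines():
        match = _ITEM_RE.match(line)
        if match:
            items.append(
                {
                    "index": int(match.group(1)),
                    "text": match.group(2),
                    "category": match.group(3),
                }
            )
    return items
```

With the objects defined above, `parse_rubrics(generate_rubric_text(model, tok, message))` would return the rubric items as a list of dictionaries. Lines that do not follow the requested format are dropped rather than repaired, which keeps the sketch simple but may hide malformed generations.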

Citation

If you find our work helpful, please consider citing our paper:

bibtex

@misc{jiang2026rubric,
  title={RUBRIC-ARROW: Alternating Pointwise Rubric Reward Modeling for LLM Post-training in Non-verifiable Domains},
  author={Haoxiang Jiang and Zihan Dong and Tianci Liu and Wanying Wang and Ran Xu and Tony Yu and Linjun Zhang and Haoyu Wang},
  year={2026},
  eprint={2605.29156},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2605.29156},
}

Model provider: OpenRubrics

Model tree: Qwen/Qwen3-8B (base), this model (fine-tuned)

Modalities: text input, text output
