README
OpenRubrics/RubricARROW-8B-Rubric
This is an 8B rubric generation model, introduced in the paper RUBRIC-ARROW: Alternating Pointwise Rubric Reward Modeling for LLM Post-training in Non-verifiable Domains.
It is fine-tuned from Qwen/Qwen3-8B.
Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OpenRubrics/RubricARROW-8B-Rubric"
tok = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
```
To evaluate the model, build the input message using the following format.
```python
RUBRIC_PROMPT_TEMPLATE = (
    "Your task is to extract a set of rubric-style instructions from a user's request.\n"
    "These rubrics will be used as evaluation criteria to check if a response fully meets the request.\n"
    "Every rubric item must be a universal principle. If any rubric still contains topic-specific references (e.g., names, places, myths, numbers, historical facts), it is automatically invalid.\n"
    "\n"
    "- **Two Distinct Categories:**\n"
    " - [Hard Rule]: Derived strictly from explicit requirements stated in the <request> (format, length, structure, forbidden/required elements, etc.).\n"
    " - [Principle]: Derived by abstracting any concrete cues into domain-agnostic quality criteria (e.g., clarity, correctness, sound reasoning, pedagogy).\n"
    "\n"
    "- **Comprehensiveness:**\n"
    " The rubric must cover all critical aspects implied by the request and examples, including explicit requirements and implicit quality standards.\n"
    "\n"
    "- **Conciseness & Uniqueness:**\n"
    " Each rubric must capture a distinct evaluation criterion. Overlapping or redundant criteria must be merged into a single rubric. Wording must be precise and free of repetition.\n"
    "\n"
    "- **Format Requirements:**\n"
    " - Use a numbered list.\n"
    " - Each item starts with \"The response\" phrased in third person.\n"
    " - Append [Hard Rule] or [Principle] at the end of each item.\n"
    " - Do not include reasoning, explanations, or examples in the final output—only the rubrics.\n"
    "\n"
    "Here is the request:\n"
    "{prompt}\n"
    "\n"
    "Please generate the rubrics for the above request."
)

# Example request (placeholder, not from the model card); replace with your own.
instruction = "Write a short poem about autumn."

# Fill the template with the user request.
user_text = RUBRIC_PROMPT_TEMPLATE.format(prompt=instruction)

messages_list = [
    {"role": "user", "content": user_text},
]

# Render the chat template to a prompt string, with thinking mode disabled.
message = tok.apply_chat_template(
    messages_list,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)

# Remaining step: Use either HF or vLLM for evaluation.
# ...
# ...
```
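The remaining generation step is left open above. As one possible way to fill it in, the sketch below runs plain Hugging Face `generate` on the rendered `message` and then splits the output into structured rubric items. The helper names `generate_rubrics` and `parse_rubrics`, greedy decoding, and the 1024-token budget are assumptions for illustration, not settings from the model card. The parser also assumes the model follows the requested format, with one numbered rubric per line ending in `[Hard Rule]` or `[Principle]`.

```python
import re

# Matches lines such as "1. The response ... [Hard Rule]".
RUBRIC_LINE = re.compile(r"^\s*(\d+)[.)]\s*(.+?)\s*\[(Hard Rule|Principle)\]\s*$")


def generate_rubrics(model, tok, message, max_new_tokens=1024):
    """Generate rubric text for one chat-formatted prompt string (hypothetical helper)."""
    import torch  # imported lazily so parse_rubrics works without torch installed

    # The chat template already inserted special tokens, so do not add them again.
    inputs = tok(message, return_tensors="pt", add_special_tokens=False).to(model.device)
    with torch.no_grad():
        # Greedy decoding and the token budget are assumptions, not documented settings.
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tok.decode(new_tokens, skip_special_tokens=True)


def parse_rubrics(text):
    """Split generated text into rubric items; lines that do not match are skipped."""
    items = []
    for line in text.splitlines():
        match = RUBRIC_LINE.match(line)
        if match:
            items.append(
                {
                    "index": int(match.group(1)),
                    "text": match.group(2),
                    "category": match.group(3),
                }
            )
    return items


# Usage, with `model`, `tok`, and `message` built as in the snippets above:
# rubrics = parse_rubrics(generate_rubrics(model, tok, message))
```

Skipping non-matching lines keeps the parser tolerant of stray text, but it also silently drops rubrics that span several lines, so checking the number of parsed items against the raw output is advisable.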
Citation
If you find our work helpful, please consider citing our paper:
```bibtex
@misc{jiang2026rubric,
  title={RUBRIC-ARROW: Alternating Pointwise Rubric Reward Modeling for LLM Post-training in Non-verifiable Domains},
  author={Haoxiang Jiang and Zihan Dong and Tianci Liu and Wanying Wang and Ran Xu and Tony Yu and Linjun Zhang and Haoyu Wang},
  year={2026},
  eprint={2605.29156},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2605.29156},
}
```
Model provider: OpenRubrics
Model tree
Base: Qwen/Qwen3-8B
Fine-tuned: this model
Modalities
Input: Text
Output: Text