README

License: apache-2.0

Overview

This model is a general-purpose evaluation judge that scores responses against user-specified rubrics. It accepts arbitrary input formats; you only need to specify the desired output format in your prompt.

Key Features:

  • Multi-dimensional rubric-based scoring
  • Flexible input: any QA pair + custom rubric
  • Structured JSON output with per-criterion scores and reasoning
  • Trained on diverse evaluation datasets via online self-distillation

Usage

With Transformers + PEFT

python

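from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach the LoRA adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.6-27B",
    torch_dtype="auto",
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "Uranus/Qwen3.6-27B-JudgeOPSD-0604")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.6-27B")

prompt = """You are a professional scoring judge. Score the QA pair on multiple dimensions according to the rubric, and output strict JSON with no extra content.
Question: A rectangle has a perimeter of 48. What is its maximum area?
Answer to evaluate: The perimeter is 48, so length + width = 24. The maximum area is 144, reached when the rectangle is a square.
Scoring dimensions: 1. Answer correctness (weight 0.8) 2. Formula usage (weight 0.15) 3. Logical completeness (weight 0.05)
Output format: {"score":0~1,"item_detail":[{"criterion":"","single_score":0~1,"weight":0~1,"reason":""}],"total_reason":""}"""

messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# do_sample=True makes sure the temperature setting is actually applied.
outputs = model.generate(**inputs, max_new_tokens=2048, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)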

With vLLM (Recommended for Production)

python

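from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Serve the base model with LoRA support enabled; the adapter rank is 64.
llm = LLM(
    model="Qwen/Qwen3.6-27B",
    enable_lora=True,
    max_lora_rank=64,
    max_model_len=4096,
)
sampling_params = SamplingParams(temperature=0.7, max_tokens=2048)
# Arguments: adapter name, integer adapter ID, adapter path or repo.
lora_request = LoRARequest("judge", 1, "Uranus/Qwen3.6-27B-JudgeOPSD-0604")

prompt = """You are a professional scoring judge. Score the QA pair on multiple dimensions according to the rubric, and output strict JSON with no extra content.
Question: What is photosynthesis?
Answer to evaluate: Photosynthesis is the process by which plants use sunlight to convert carbon dioxide and water into glucose and oxygen.
Scoring dimensions: 1. Accuracy (weight 0.6) 2. Completeness (weight 0.3) 3. Clarity of expression (weight 0.1)
Output format: {"score":0~1,"item_detail":[{"criterion":"","single_score":0~1,"weight":0~1,"reason":""}],"total_reason":""}"""

# Note: generate() sends the raw prompt text; llm.chat() would apply the chat template instead.
outputs = llm.generate(prompt, sampling_params, lora_request=lora_request)
print(outputs[0].outputs[0].text)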
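
Both examples print raw model text. If the model wraps the JSON in extra text despite the instruction, the object still has to be pulled out before it can be used. The helper below is a minimal sketch of one way to do that; `extract_json` is a hypothetical name, not part of the model's tooling, and it assumes the first decodable JSON object in the text is the judgment.

```python
import json


def extract_json(text: str) -> dict:
    """Return the first JSON object that can be decoded from `text`.

    Tries each '{' in turn as a possible start of the object, so any
    prose before or after the JSON is ignored.
    """
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever follows it.
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model output")
```

With the Transformers example this would be called as `extract_json(response)`, and with vLLM as `extract_json(outputs[0].outputs[0].text)`.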

Prompt Format

The model is flexible with input format. A typical prompt structure:

markdown

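You are a professional scoring judge. Score the QA pair on multiple dimensions according to the rubric, and output strict JSON with no extra content.
Question: {question}
Answer to evaluate: {answer}
Scoring dimensions: {rubric_dimensions}
Output format: {desired_json_schema}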
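
The template can be filled programmatically. The sketch below is illustrative only: `build_judge_prompt`, `format_rubric`, and `DEFAULT_SCHEMA` are hypothetical names, the field labels are English renderings of the template's labels, and the check that weights sum to 1 is an assumption drawn from the usage examples rather than a documented requirement.

```python
# Output schema copied from the usage examples in this README.
DEFAULT_SCHEMA = (
    '{"score":0~1,"item_detail":[{"criterion":"","single_score":0~1,'
    '"weight":0~1,"reason":""}],"total_reason":""}'
)

PROMPT_TEMPLATE = (
    "You are a professional scoring judge. Score the QA pair on multiple "
    "dimensions according to the rubric, and output strict JSON with no "
    "extra content.\n"
    "Question: {question}\n"
    "Answer to evaluate: {answer}\n"
    "Scoring dimensions: {rubric_dimensions}\n"
    "Output format: {desired_json_schema}"
)


def format_rubric(dimensions: list[tuple[str, float]]) -> str:
    """Render [(name, weight), ...] as '1. name (weight w) 2. ...'."""
    return " ".join(
        f"{i}. {name} (weight {weight})"
        for i, (name, weight) in enumerate(dimensions, start=1)
    )


def build_judge_prompt(
    question: str,
    answer: str,
    dimensions: list[tuple[str, float]],
    output_schema: str = DEFAULT_SCHEMA,
) -> str:
    """Fill the prompt template for one QA pair and rubric."""
    total = sum(weight for _, weight in dimensions)
    # Assumption: weights are meant to sum to 1, as in the README examples.
    if abs(total - 1.0) > 1e-6:
        raise ValueError(f"rubric weights sum to {total}, expected 1.0")
    return PROMPT_TEMPLATE.format(
        question=question,
        answer=answer,
        rubric_dimensions=format_rubric(dimensions),
        desired_json_schema=output_schema,
    )
```

Passing the schema as a format argument keeps its literal braces from being interpreted as template fields.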

Expected Output Example:

json

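{
  "score": 0.85,
  "item_detail": [
    {"criterion": "Answer correctness", "single_score": 0.9, "weight": 0.8, "reason": "The answer is correct; the area is maximized at 144 when the rectangle is a square"},
    {"criterion": "Formula usage", "single_score": 0.8, "weight": 0.15, "reason": "The perimeter formula is used but not written out explicitly"},
    {"criterion": "Logical completeness", "single_score": 0.7, "weight": 0.05, "reason": "The reasoning steps are rather brief"}
  ],
  "total_reason": "The answer is correct and the core reasoning is complete, but the formulas and reasoning steps could be shown in more detail"
}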
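
This README does not state that the overall `score` equals the weighted sum of the per-criterion scores. In the example above the weighted sum is 0.875 while the reported score is 0.85. The sketch below recomputes the weighted sum as a sanity check; `weighted_score` is a hypothetical helper and the 0.05 tolerance is an arbitrary assumption.

```python
import math


def weighted_score(result: dict) -> float:
    """Sum single_score * weight over all criteria in a judge result."""
    return sum(
        item["single_score"] * item["weight"] for item in result["item_detail"]
    )


# The example output above, with the reason fields omitted for brevity.
example = {
    "score": 0.85,
    "item_detail": [
        {"criterion": "Answer correctness", "single_score": 0.9, "weight": 0.8},
        {"criterion": "Formula usage", "single_score": 0.8, "weight": 0.15},
        {"criterion": "Logical completeness", "single_score": 0.7, "weight": 0.05},
    ],
}

recomputed = weighted_score(example)
print(round(recomputed, 3))  # 0.875
# Flag results whose overall score strays far from the weighted sum.
print(math.isclose(recomputed, example["score"], abs_tol=0.05))  # True
```

A check like this can flag outputs whose overall score drifts far from the rubric weights. It should not be read as the model's own scoring rule.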

Training Details

| Hyperparameter | Value |
|---|---|
| Method | LoRA + OPSD (Online Policy Self-Distillation) |
| LoRA Rank | 64 |
| LoRA Alpha | 128 |
| Learning Rate | 1e-5 |
| Epochs | 1 |
| Batch Size | 1 × 8 (grad accum) × 8 GPUs = effective 64 |
| Max Sequence Length | 4096 |
| Max Completion Length | 2048 |
| Temperature | 1.0 |
| OPSD Beta | 0.5 |
| Hardware | 8 × NVIDIA H20-98G |
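
Two of the values above are derived quantities. The short sketch below spells out the arithmetic; the variable names are illustrative, and the LoRA scaling factor assumes the standard `alpha / rank` convention, which the README does not confirm was used.

```python
# Effective batch size: per-device batch × gradient accumulation × GPUs.
per_device_batch = 1
grad_accum_steps = 8
num_gpus = 8
effective_batch = per_device_batch * grad_accum_steps * num_gpus
print(effective_batch)  # 64

# LoRA update scaling under the standard convention alpha / rank (assumed).
lora_rank = 64
lora_alpha = 128
lora_scaling = lora_alpha / lora_rank
print(lora_scaling)  # 2.0
```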

Training Data

A mixture of 4 evaluation/feedback datasets:

Limitations

  • Optimized for rubric-based scoring tasks; may not generalize well to open-ended generation
  • Best performance with structured output prompts specifying JSON format
  • Score calibration may vary across different rubric scales
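
When rubrics use different scales, one simple way to make scores comparable is to map each onto a common 0 to 1 range. The sketch below is an illustration only; `normalize_score` is a hypothetical helper, and linear rescaling does not correct any calibration differences in the judge itself.

```python
def normalize_score(raw: float, low: float, high: float) -> float:
    """Linearly map a score from the range [low, high] onto [0, 1].

    Values outside the range are clipped to its endpoints first.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    clipped = min(max(raw, low), high)
    return (clipped - low) / (high - low)


print(normalize_score(4, 1, 5))    # 0.75
print(normalize_score(7, 0, 10))   # 0.7
print(normalize_score(12, 0, 10))  # 1.0 (clipped)
```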

Citation

If you find this model useful, please cite:

bibtex

@misc{qwen36-judgeopsd-0604,
title={Qwen3.6-27B-JudgeOPSD-0604},
author={Uranus},
year={2026},
url={https://huggingface.co/Uranus/Qwen3.6-27B-JudgeOPSD-0604}
}

License

This model inherits the Apache 2.0 License from the base model.

Model provider

Uranus


Model tree

Base

Qwen/Qwen3.6-27B

Adapter

this model

Modalities

Input

Video, Text, Image

Output

Text
