License: apache-2.0
Overview
This model is trained as a general-purpose evaluation judge that scores responses against user-specified rubrics. It accepts arbitrary input formats; you only need to specify the desired output format in your prompt.
Key Features:
- Multi-dimensional rubric-based scoring
- Flexible input: any QA pair + custom rubric
- Structured JSON output with per-criterion scores and reasoning
- Trained on diverse evaluation datasets via online self-distillation
Usage
With Transformers + PEFT
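```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach the LoRA judge adapter.
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.6-27B",
    torch_dtype="auto",
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "Uranus/Qwen3.6-27B-JudgeOPSD-0604")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.6-27B")

prompt = """You are a professional scoring judge. Score the QA pair on multiple dimensions according to the rubric, and output strict JSON with no extra content.
Question: A rectangle has a perimeter of 48. What is its maximum area?
Answer to evaluate: The perimeter is 48, so length + width = 24. The maximum area is 144, reached when the rectangle is a square.
Scoring dimensions: 1. Answer correctness (weight 0.8) 2. Formula usage (weight 0.15) 3. Logical completeness (weight 0.05)
Output format: {"score":0~1,"item_detail":[{"criterion":"","single_score":0~1,"weight":0~1,"reason":""}],"total_reason":""}"""

messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# do_sample=True makes the temperature setting take effect.
outputs = model.generate(**inputs, max_new_tokens=2048, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)
```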
With vLLM (Recommended for Production)
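```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Serve the base model with LoRA enabled; the rank must cover the adapter's rank (64).
llm = LLM(
    model="Qwen/Qwen3.6-27B",
    enable_lora=True,
    max_lora_rank=64,
    max_model_len=4096,
)
sampling_params = SamplingParams(temperature=0.7, max_tokens=2048)
lora_request = LoRARequest("judge", 1, "Uranus/Qwen3.6-27B-JudgeOPSD-0604")

prompt = """You are a professional scoring judge. Score the QA pair on multiple dimensions according to the rubric, and output strict JSON with no extra content.
Question: What is photosynthesis?
Answer to evaluate: Photosynthesis is the process by which plants use sunlight to convert carbon dioxide and water into glucose and oxygen.
Scoring dimensions: 1. Accuracy (weight 0.6) 2. Completeness (weight 0.3) 3. Clarity of expression (weight 0.1)
Output format: {"score":0~1,"item_detail":[{"criterion":"","single_score":0~1,"weight":0~1,"reason":""}],"total_reason":""}"""

# llm.chat applies the model's chat template, matching the Transformers example.
messages = [{"role": "user", "content": prompt}]
outputs = llm.chat(messages, sampling_params, lora_request=lora_request)
print(outputs[0].outputs[0].text)
```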
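Both snippets above return the judge's reply as plain text. The following is a minimal sketch for pulling the JSON object out of that text. It assumes the reply may carry surrounding text, such as a `<think>...</think>` block or a Markdown fence; the README does not describe the raw output beyond its JSON example, so treat that as an assumption. `extract_json` is a hypothetical helper name, not part of the model or any library.

```python
import json
import re


def extract_json(response: str) -> dict:
    """Return the first JSON object found in a judge response.

    Assumption: the response may contain extra text around the JSON,
    for example a <think>...</think> block or a Markdown code fence.
    """
    # Drop a reasoning block, if one is present.
    text = re.sub(r"<think>.*?</think>", "", response, flags=re.DOTALL)
    decoder = json.JSONDecoder()
    # Try to decode starting at each '{' until one yields a valid object.
    for match in re.finditer(r"\{", text):
        try:
            obj, _ = decoder.raw_decode(text, match.start())
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict):
            return obj
    raise ValueError("no JSON object found in response")
```

Raising on failure rather than returning `None` keeps malformed replies from silently entering an evaluation pipeline; callers can catch `ValueError` and retry or log the sample.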
Prompt Format
The model is flexible with input format. A typical prompt structure:
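```markdown
You are a professional scoring judge. Score the QA pair on multiple dimensions according to the rubric, and output strict JSON with no extra content.
Question: {question}
Answer to evaluate: {answer}
Scoring dimensions: {rubric_dimensions}
Output format: {desired_json_schema}
```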
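The template can be filled programmatically. The sketch below assembles a prompt from a QA pair and a weighted rubric. `build_judge_prompt` and `SCHEMA` are hypothetical names, the English instruction line is a translation of the template's first line, and the check that weights sum to 1 is an assumption drawn from the examples in this README, not a documented requirement.

```python
def build_judge_prompt(question, answer, rubric, output_schema):
    """Assemble a judge prompt from a QA pair and a weighted rubric.

    rubric is a list of (criterion_name, weight) pairs.
    """
    total_weight = sum(weight for _, weight in rubric)
    # Assumption: weights sum to 1, as in the examples in this README.
    if abs(total_weight - 1.0) > 1e-6:
        raise ValueError(f"rubric weights sum to {total_weight}, expected 1.0")

    dimensions = " ".join(
        f"{i}. {name} (weight {weight})"
        for i, (name, weight) in enumerate(rubric, start=1)
    )
    return (
        "You are a professional scoring judge. Score the QA pair on multiple "
        "dimensions according to the rubric, and output strict JSON with no "
        "extra content.\n"
        f"Question: {question}\n"
        f"Answer to evaluate: {answer}\n"
        f"Scoring dimensions: {dimensions}\n"
        f"Output format: {output_schema}"
    )


# Output schema copied from the usage examples.
SCHEMA = (
    '{"score":0~1,"item_detail":[{"criterion":"","single_score":0~1,'
    '"weight":0~1,"reason":""}],"total_reason":""}'
)
prompt = build_judge_prompt(
    question="What is photosynthesis?",
    answer="Photosynthesis is how plants turn sunlight, CO2 and water into glucose and oxygen.",
    rubric=[("Accuracy", 0.6), ("Completeness", 0.3), ("Clarity of expression", 0.1)],
    output_schema=SCHEMA,
)
print(prompt)
```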
Expected Output Example:
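```json
{
  "score": 0.85,
  "item_detail": [
    {"criterion": "Answer correctness", "single_score": 0.9, "weight": 0.8, "reason": "The answer is correct; the area is maximized at 144 when the rectangle is a square"},
    {"criterion": "Formula usage", "single_score": 0.8, "weight": 0.15, "reason": "Uses the perimeter formula but does not state it explicitly"},
    {"criterion": "Logical completeness", "single_score": 0.7, "weight": 0.05, "reason": "The reasoning steps are rather brief"}
  ],
  "total_reason": "The answer is correct and the core reasoning is complete, but the formula and reasoning steps could be shown in more detail"
}
```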
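A parsed result can be sanity-checked by recomputing the weighted sum of the per-criterion scores. The README does not state how the top-level `score` is derived, so the sketch below only assumes it should land near the weighted sum and uses a tolerance. In the example above the weighted sum is 0.875 against a reported 0.85. `weighted_score`, `is_consistent`, and the 0.05 tolerance are illustrative choices.

```python
def weighted_score(result: dict) -> float:
    """Weighted sum of per-criterion scores from a parsed judge result."""
    return sum(
        item["single_score"] * item["weight"] for item in result["item_detail"]
    )


def is_consistent(result: dict, tol: float = 0.05) -> bool:
    """Check that the reported score is within tol of the weighted sum.

    Assumption: the top-level score approximates the weighted sum;
    the tolerance is an arbitrary illustrative value.
    """
    return abs(weighted_score(result) - result["score"]) <= tol


# Numbers taken from the expected output example.
example = {
    "score": 0.85,
    "item_detail": [
        {"criterion": "Answer correctness", "single_score": 0.9, "weight": 0.8},
        {"criterion": "Formula usage", "single_score": 0.8, "weight": 0.15},
        {"criterion": "Logical completeness", "single_score": 0.7, "weight": 0.05},
    ],
}
print(round(weighted_score(example), 3))  # 0.875
print(is_consistent(example))             # True
```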
Training Details
| Hyperparameter | Value |
|---|---|
| Method | LoRA + OPSD (Online Policy Self-Distillation) |
| LoRA Rank | 64 |
| LoRA Alpha | 128 |
| Learning Rate | 1e-5 |
| Epochs | 1 |
| Batch Size | 1 × 8 (grad accum) × 8 GPUs = effective 64 |
| Max Sequence Length | 4096 |
| Max Completion Length | 2048 |
| Temperature | 1.0 |
| OPSD Beta | 0.5 |
| Hardware | 8 × NVIDIA H20-98G |
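A few derived quantities follow from the table by simple arithmetic, sketched below. The LoRA scaling factor uses the standard `alpha / rank` convention (rank-stabilized variants scale differently). The prompt budget assumes that the maximum sequence length covers prompt plus completion, which the table does not state.

```python
# Values copied from the hyperparameter table.
per_device_batch = 1
grad_accum_steps = 8
num_gpus = 8
lora_rank = 64
lora_alpha = 128
max_seq_len = 4096
max_completion_len = 2048

# Effective batch size: samples contributing to each optimizer step.
effective_batch = per_device_batch * grad_accum_steps * num_gpus

# Standard LoRA scales the low-rank update by alpha / rank.
lora_scale = lora_alpha / lora_rank

# Assumption: max_seq_len includes both prompt and completion tokens.
max_prompt_len = max_seq_len - max_completion_len

print(effective_batch, lora_scale, max_prompt_len)  # 64 2.0 2048
```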
Training Data
A mixture of 4 evaluation/feedback datasets.
Limitations
- Optimized for rubric-based scoring tasks; may not generalize well to open-ended generation
- Best performance with structured output prompts specifying JSON format
- Score calibration may vary across different rubric scales
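Given the calibration caveat, scores produced under different rubric scales are easier to compare after mapping them onto a common range. The sketch below applies a plain linear rescale to [0, 1]. `to_unit_interval` is a hypothetical helper, and a linear map only aligns the ranges; it does not correct any calibration differences in the model itself.

```python
def to_unit_interval(score: float, low: float, high: float) -> float:
    """Linearly map a score from the rubric scale [low, high] to [0, 1].

    Out-of-range scores are clipped to the scale bounds first.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    clipped = min(max(score, low), high)
    return (clipped - low) / (high - low)


print(to_unit_interval(4, 1, 5))   # 0.75 on a 1-5 scale
print(to_unit_interval(7, 0, 10))  # 0.7 on a 0-10 scale
```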
Citation
If you find this model useful, please cite:
```bibtex
@misc{qwen36-judgeopsd-0604,
  title={Qwen3.6-27B-JudgeOPSD-0604},
  author={Uranus},
  year={2026},
  url={https://huggingface.co/Uranus/Qwen3.6-27B-JudgeOPSD-0604}
}
```
License
This model inherits the Apache 2.0 License from the base model.
Model provider
Uranus
Model tree
- Base: Qwen/Qwen3.6-27B
- Adapter: this model
Modalities
- Input: Video, Text, Image
- Output: Text