Run this model inference on single tenant GPU with unmatched speed and reliability at scale.
Run this model inference with full control and performance in your environment.
Get help setting up a custom Dedicated Endpoints.
Talk with our engineer to get a quote for reserved GPU instances with discounts.
README
License: apache-2.0Checkpoints
Every saved training step is a separate git revision. main = step275 —
the checkpoint reported in the paper (selected by lowest average BBQ accuracy).
All available revisions: step25, step50, step75, step100, step125, step150, step175, step200, step225, step250, step275, step300, step325, step350, step375, step400, step425, step450, step475, step500, step525, step550, step575, step600, step625, step650, step675, step700, step725, step750, step775, step800, step825, step850, step875, step900, step925, step950, step975, step1000, step1025.
python
from transformers import AutoModelForCausalLMfrom peft import PeftModelbase = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct")model = PeftModel.from_pretrained(base, "MichiganNLP/Qwen2.5-7B-Instruct-bias-z12-Age-lora", revision="step275")
Details
- Base model:
Qwen/Qwen2.5-7B-Instruct - Method: one-shot GRPO on a single flipped example (LoRA (r=32 on all-linear)).
- Paper example: z̃₁₂ — category Age.
mainrevision:step275, the step reported in the paper.
Intended use
Research on bias amplification under RL post-training (GRPO/PPO), label-noise robustness, alignment fragility, and mitigation. Not for deployment or for producing biased or harmful content.
Model provider
MichiganNLP
Model tree
Base
Qwen/Qwen2.5-7B-Instruct
Adapter
this model
Modalities
Input
Text
Output
Text
Pricing
Dedicated Endpoints
View detailsSupported Functionality
Model APIs
Dedicated Endpoints
Container
More information