README

License: apache-2.0

Result (LawBench 191-class, 913-case held-out test, top-1 accuracy)

approach                                                    acc
prior SOTA                                                  0.450
SIA gpt-oss-120b (W+H)                                      0.701
TF-IDF harness (no LLM)                                     0.760
best: this LoRA (vLLM, rope_theta fix) ⊕ TF-IDF ensemble    0.77
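
To make the accuracies above more concrete, the short sketch below converts each reported figure into an approximate number of correct cases out of the 913-case test set. The helper name `approx_correct` is an illustrative assumption, and because 0.77 is reported to only two decimals, its count is a rough estimate rather than an exact figure.

```python
# Rough conversion of reported top-1 accuracies to correct-case counts.
# The accuracies and the 913-case test size come from the results table;
# everything else (names, rounding choice) is illustrative.
TEST_CASES = 913

reported = {
    "prior SOTA": 0.450,
    "SIA gpt-oss-120b (W+H)": 0.701,
    "TF-IDF harness (no LLM)": 0.760,
    "LoRA + TF-IDF ensemble": 0.77,  # reported to two decimals only
}


def approx_correct(acc, n=TEST_CASES):
    """Return the nearest whole number of correct cases for accuracy `acc`."""
    return round(acc * n)


counts = {name: approx_correct(acc) for name, acc in reported.items()}

# Approximate lift of the ensemble over the TF-IDF harness, in cases.
lift = counts["LoRA + TF-IDF ensemble"] - counts["TF-IDF harness (no LLM)"]

for name, correct in counts.items():
    print(f"{name}: ~{correct} of {TEST_CASES}")
print(f"approximate ensemble lift over harness: ~{lift} cases")
```

With these rounded figures the ensemble's lift over the harness is on the order of nine cases, though the two-decimal precision of 0.77 leaves a few cases of uncertainty either way.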

The ensemble beats both the prior SOTA (0.450) and SIA's 0.701. Honest caveat: the winning 0.77 comes from an ensemble of this LoRA (served via vLLM with a rope_theta fix) and a TF-IDF char-ngram classifier, so the LoRA's contribution is only the marginal lift over the 0.760 harness. Naive LoRA inference without the rope_theta fix scored far lower.
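
The README does not say how the LoRA and the TF-IDF classifier are combined. As a minimal sketch, assuming a weighted average of per-class scores (one common way to ensemble two classifiers, not necessarily the one used here), the combination and the top-1 metric might look like the following. The function names, the `lora_weight` parameter, and the class labels are all hypothetical.

```python
# Hypothetical ensemble sketch: weighted average of two classifiers'
# per-class scores, followed by a top-1 pick. This is an assumed scheme,
# not the method confirmed by the README.

def normalize(scores):
    """Scale non-negative scores so they sum to 1."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return {label: value / total for label, value in scores.items()}


def ensemble_top1(lora_scores, tfidf_scores, lora_weight=0.5):
    """Return the label with the highest weighted-average score."""
    p = normalize(lora_scores)
    q = normalize(tfidf_scores)
    labels = sorted(set(p) | set(q))  # sorted for deterministic tie-breaking
    combined = {
        label: lora_weight * p.get(label, 0.0)
        + (1.0 - lora_weight) * q.get(label, 0.0)
        for label in labels
    }
    return max(labels, key=lambda label: combined[label])


def top1_accuracy(predictions, gold):
    """Fraction of predictions that exactly match the gold labels."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    hits = sum(1 for pred, ref in zip(predictions, gold) if pred == ref)
    return hits / len(gold)


# Toy scores for one case over three made-up class labels.
lora = {"class_a": 0.6, "class_b": 0.4}
tfidf = {"class_a": 0.2, "class_b": 0.7, "class_c": 0.1}

equal_mix = ensemble_top1(lora, tfidf, lora_weight=0.5)
lora_heavy = ensemble_top1(lora, tfidf, lora_weight=0.9)
```

In this toy case the equal mix follows the TF-IDF classifier's preference, while weighting the LoRA heavily flips the prediction. That is the kind of trade-off a mixing weight would control if the real ensemble works this way.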

Files

  • adapter_model.safetensors — the LoRA weights
  • adapter_config.json, tokenizer.json, chat_template.jinja

Replicate

  • Task + harness: github.com/evo-hq/evo-posttrainbench (branch evo-variant, src/eval/tasks/lawbench)
  • Optimizer: evo 0.5.0-alpha.13 (github.com/evo-hq/evo, release/0.5)
  • Run: scripts/run.sh run lawbench openai/gpt-oss-120b <hours>

Model provider: alok97

Model tree

  • Base: openai/gpt-oss-120b
  • Adapter: this model

Modalities

  • Input: Text
  • Output: Text
