Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more
Container

Run this model inference with full control and performance in your environment.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

README

License: apache-2.0

Prompt format

Unified JSON format: a system prompt (task + output schema) + a numbered user sequence → one JSON answer ({"reasoning": "...", "steps": [...]} for next-step/completion;

markdown

{"reasoning": "...", "valid": bool, "rule": "RULE_..."|null}
for anomaly). Build the exact messages with zo_train.prompts.build_messages from the project repo, then apply the tokenizer chat template. See the flagship model card for a full from_pretrained snippet.

Evaluation (MOSFET labeled eval, n≈200)

taskthis checkpointn-gram baseline
next-step (top-1)0.4350.69
sequence completion (block-acc)0.5000.637
anomaly (F1)0.0000.89

Full study + all checkpoints: the project repo and submissions/XCombinator/REPORT.md.

Notes

  • Full fine-tune (not a LoRA adapter) — loads directly with AutoModelForCausalLM.from_pretrained.
  • Trained on Leonardo (CINECA) A100 via a deterministic data factory over the organizer grammar.

Model provider

XCombinator

Model tree

Base

Qwen/Qwen2.5-1.5B-Instruct

Fine-tuned

this model

Modalities

Input

Text

Output

Text

Pricing

Dedicated Endpoints

View details

Supported Functionality

Model APIs

Dedicated Endpoints

Container

More information

Explore FriendliAI today