cmu-lti/osim-8b-post API & Inference Endpoint

Intended use

Simulating the human/user side of conversations — user simulation for agent evaluation, social simulation, persona / role-play. Conditioned on a "social-context" system prompt (who is speaking: role, goal, background, style); given the other party's turns it generates the next human turn.

Results

Evaluated out-of-distribution as the user simulator in the τ-USI agentic benchmark (τ-bench airline+retail, 165 tasks, fixed GPT-5.2 agent), OSim-8B reaches USI 75.6 — the best behavioral / specialized user simulator, surpassing same-size general instruct models and every prior specialized simulator (CoSER-8B 67.2, UserLM-8B 62.0). It is distinctively human-like in reactivity (Sørensen–Dice D4 ≈ 93, matching the human inter-annotator level) and in outcome calibration (best ECE among compared models), with essentially none of the long-horizon agentic failure modes (timeouts/perseveration) seen in non-behavioral baselines.

Training

Base: Qwen3-8B
Stages: midtraining on the OdysSim corpus → task-specific reinforcement learning + expert consolidation.

Citation

If you use this model, please cite the OdysSim paper (Building Foundation Models for Human Behavior Simulation). Code: https://github.com/sunnweiwei/OdysSim

Intended use

Results

Training

Base: Qwen3-8B
Stages: midtraining on the OdysSim corpus → task-specific reinforcement learning + expert consolidation.

Citation

If you use this model, please cite the OdysSim paper (Building Foundation Models for Human Behavior Simulation). Code: https://github.com/sunnweiwei/OdysSim

osim-8b-post

Get help setting up a custom Dedicated Endpoints.

README

Intended use

Results

Training

Citation

Explore FriendliAI today

README

Intended use

Results

Training

Citation