README

License: apache-2.0

Model details

  • Base model: Qwen/Qwen3-8B (8.2B params, 36 layers, GQA 32/8 heads, YaRN rope scaling)
  • Training: Full-parameter supervised fine-tuning on the ISETrace trajectory corpus
  • Context: up to 40,960 tokens (training max_length); base supports 131,072 with YaRN
  • Precision: bfloat16
  • Format: standard HuggingFace Qwen3ForCausalLM safetensors — loads directly with transformers
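
As a rough sizing aid, the figures above translate into a memory footprint along the following lines. This is a back-of-the-envelope sketch, not a measurement: the per-head dimension of 128 is an assumption that this card does not state, and the estimate ignores activations and framework overhead.

```python
# Rough memory estimate derived from the model details above.
# HEAD_DIM is an ASSUMPTION; every other constant comes from the card.

PARAMS = 8.2e9             # parameter count
LAYERS = 36                # transformer layers
KV_HEADS = 8               # key/value heads under GQA (32 query / 8 KV)
HEAD_DIM = 128             # ASSUMPTION: per-head dimension, not stated in the card
BYTES_BF16 = 2             # bfloat16 stores each value in 2 bytes
TRAIN_MAX_TOKENS = 40_960  # training max_length


def weight_bytes(params: float = PARAMS, bytes_per_value: int = BYTES_BF16) -> float:
    """Bytes needed to hold the weights alone."""
    return params * bytes_per_value


def kv_cache_bytes_per_token(
    layers: int = LAYERS,
    kv_heads: int = KV_HEADS,
    head_dim: int = HEAD_DIM,
    bytes_per_value: int = BYTES_BF16,
) -> int:
    """Bytes of key + value cache kept for one token of one sequence."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value


if __name__ == "__main__":
    gib = 1024 ** 3
    per_token = kv_cache_bytes_per_token()
    print(f"weights: {weight_bytes() / gib:.1f} GiB")
    print(f"KV cache per token: {per_token / 1024:.0f} KiB")
    print(f"KV cache at {TRAIN_MAX_TOKENS} tokens: {per_token * TRAIN_MAX_TOKENS / gib:.2f} GiB")
```

Under these assumptions the weights need about 15 GiB and a single sequence at the full training length adds roughly 5.6 GiB of KV cache, so the 8 KV heads (rather than 32) are what keep long agent dialogues affordable.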

The model is trained for multi-turn OS/tool-use agent interaction: it emits <tool_call>...</tool_call> blocks, consumes <tool_response>...</tool_response>, and sustains long task-completion dialogues. It uses the Qwen3 chat template (shipped as chat_template.jinja).
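
The turn structure described above can be sketched as follows. This is a minimal illustration, assuming each <tool_call> block wraps a JSON object with "name" and "arguments" keys, as in the stock Qwen3 chat template. The helper names (extract_tool_calls, tool_response_message), the run_shell tool, and the sample output are hypothetical and not part of this model's release.

```python
import json
import re

# Matches the body of every <tool_call>...</tool_call> block, across newlines.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text: str) -> list[dict]:
    """Return the JSON payload of each well-formed tool call in `text`."""
    calls = []
    for body in TOOL_CALL_RE.findall(text):
        try:
            payload = json.loads(body)
        except json.JSONDecodeError:
            continue  # skip malformed blocks rather than crash the agent loop
        if isinstance(payload, dict) and "name" in payload:
            calls.append(payload)
    return calls


def tool_response_message(result: str) -> dict:
    """Wrap a tool's output as a message for the next turn.

    ASSUMPTION: the chat template renders role="tool" messages inside
    <tool_response>...</tool_response>, as the stock Qwen3 template does.
    """
    return {"role": "tool", "content": result}


# Hypothetical assistant output, for illustration only.
sample = (
    "I'll check the directory first.\n"
    "<tool_call>\n"
    '{"name": "run_shell", "arguments": {"command": "ls -S /var/log"}}\n'
    "</tool_call>"
)
calls = extract_tool_calls(sample)
```

An agent loop would execute each entry in calls, append tool_response_message(output) to the conversation, and generate again until the model replies without a tool call.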


Usage

python

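import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "valiere/ISETrace-SFT-8B"

# Load the tokenizer and the bfloat16 weights; device_map="auto" requires accelerate.
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user", "content": "List the 3 largest files under /var/log and tell me their sizes."},
]

# Render the conversation with the chat template and tokenize it.
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
).to(model.device)

# do_sample=True makes temperature and top_p take effect.
out = model.generate(
    **inputs, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.8
)

# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[1]
print(tok.decode(out[0][prompt_len:], skip_special_tokens=True))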

For tool use, pass your tool schemas through the tools= argument of apply_chat_template; the model produces OpenAI-style tool calls. For production throughput, serve the model with vLLM or SGLang.
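
A tool schema in the OpenAI function-calling layout might look like the sketch below. The run_shell tool, its parameters, and the missing_arguments helper are hypothetical examples, not tools the model is known to have been trained against. The commented call at the end shows where the schema would be passed.

```python
# Hypothetical tool definition in the OpenAI function-calling layout.
RUN_SHELL_TOOL = {
    "type": "function",
    "function": {
        "name": "run_shell",
        "description": "Run a shell command and return its stdout and stderr.",
        "parameters": {
            "type": "object",
            "properties": {
                "command": {
                    "type": "string",
                    "description": "Command line to execute.",
                },
                "timeout_s": {
                    "type": "integer",
                    "description": "Seconds before the command is killed.",
                },
            },
            "required": ["command"],
        },
    },
}


def missing_arguments(tool: dict, arguments: dict) -> list[str]:
    """List required parameters of `tool` that are absent from `arguments`."""
    required = tool["function"]["parameters"].get("required", [])
    return [name for name in required if name not in arguments]


# With a loaded tokenizer `tok` and a `messages` list, the schema is passed as:
#   tok.apply_chat_template(messages, tools=[RUN_SHELL_TOOL],
#                           add_generation_prompt=True, return_tensors="pt")
```

Checking a generated call with missing_arguments before executing it gives the agent a chance to report the error back to the model instead of failing mid-task.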


Intended use & limitations

ISETrace-SFT-8B targets macOS/Linux OS-terminal agent tasks — shell execution, file operations, and multi-step tool-use under a user simulator. It does not cover Windows, GUI-based interaction, or browser automation. As a research checkpoint it inherits the biases and knowledge cutoff of Qwen3-8B and the distribution of the ISETrace corpus.

Tool calls executed by an agent built on this model run real commands; sandbox appropriately before granting filesystem or network access.
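
One way to narrow what an agent can run is an allowlist combined with a timeout and no shell interpretation. The sketch below is illustrative only: the ALLOWED_COMMANDS set and the run_allowed helper are hypothetical, and an allowlist does not replace container-level or VM-level isolation, since an allowed command such as cat can still read any file the process can access.

```python
import shlex
import subprocess

# Hypothetical allowlist; keep only the executables your agent genuinely needs.
ALLOWED_COMMANDS = {"ls", "cat", "du", "echo"}


def run_allowed(command: str, timeout_s: float = 10.0) -> subprocess.CompletedProcess:
    """Run `command` without a shell if its executable is allowlisted.

    Raises PermissionError for anything else, and subprocess.TimeoutExpired
    if the command runs longer than `timeout_s` seconds.
    """
    argv = shlex.split(command)
    if not argv or argv[0] not in ALLOWED_COMMANDS:
        raise PermissionError(f"command not allowed: {command!r}")
    # No shell is involved, so pipes, redirects, and `;` are passed to the
    # program as literal arguments instead of being interpreted.
    return subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout_s, check=False
    )
```

Because the command is split with shlex and executed directly, a string like "echo a; rm x" only echoes its arguments; the rm never runs.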


License & citation

This model is a derivative of Qwen3-8B and is released under the Apache 2.0 license, consistent with the base model. The ISETrace training data is released separately under CC BY 4.0.

bibtex

@misc{isetrace2026,
  title        = {From Intent to Trajectory: Execution-Grounded Multi-Turn Data Synthesis for OS Agents},
  author       = {Valiere01},
  year         = {2026},
  howpublished = {\url{https://github.com/Valiere01/ISE-Trace}},
  note         = {Paper link forthcoming}
}

Model provider: valiere
Model tree: Qwen/Qwen3-8B (base), this model (fine-tuned)
Modalities: text input, text output