README
License: apache-2.0
Model details
- Base model: Qwen/Qwen3-8B (8.2B params, 36 layers, GQA 32/8 heads, YaRN rope scaling)
- Training: Full-parameter supervised fine-tuning on the ISETrace trajectory corpus
- Context: up to 40,960 tokens (training max_length); base supports 131,072 with YaRN
- Precision: bfloat16
- Format: standard HuggingFace Qwen3ForCausalLM safetensors; loads directly with transformers
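Because the checkpoint was trained with a 40,960-token max_length, long agent dialogues may want to keep prompt plus generation inside that window. The sketch below shows one way to clamp a generation budget; the helper names are hypothetical and not part of the model or of transformers.

```python
# Training max_length from the model details above.
TRAIN_MAX_LENGTH = 40_960


def remaining_generation_budget(prompt_tokens: int, limit: int = TRAIN_MAX_LENGTH) -> int:
    """Return how many new tokens still fit after a prompt of `prompt_tokens` tokens."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(0, limit - prompt_tokens)


def clamp_max_new_tokens(prompt_tokens: int, requested: int, limit: int = TRAIN_MAX_LENGTH) -> int:
    """Clamp a requested max_new_tokens so prompt + output stays within `limit`."""
    return min(requested, remaining_generation_budget(prompt_tokens, limit))
```

The prompt length can be taken from the tokenized input (for example `inputs.shape[1]`), and the clamped value passed as `max_new_tokens`.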
The model is trained for multi-turn OS/tool-use agent interaction: it emits <tool_call>...</tool_call> blocks, consumes <tool_response>...</tool_response>, and sustains long task-completion dialogues. It uses the Qwen3 chat template (shipped as chat_template.jinja).
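An agent loop has to pull the tool calls out of the generated text before it can execute anything. The following is a minimal sketch of that step, assuming each block wraps a single JSON object with `name` and `arguments` keys (the layout used by the stock Qwen3 chat template). The function name and the `run_shell` tool are illustrative only.

```python
import json
import re

# Assumption: each block holds one JSON object shaped like
# {"name": "...", "arguments": {...}}.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text: str) -> list[dict]:
    """Return the well-formed tool calls found in a model response."""
    calls = []
    for payload in TOOL_CALL_RE.findall(text):
        try:
            call = json.loads(payload)
        except json.JSONDecodeError:
            continue  # skip blocks whose payload is not valid JSON
        if isinstance(call, dict) and "name" in call:
            calls.append({"name": call["name"], "arguments": call.get("arguments", {})})
    return calls


# Hand-written sample response, not actual model output.
sample = (
    "I'll check the log directory.\n"
    "<tool_call>\n"
    '{"name": "run_shell", "arguments": {"command": "du -ah /var/log | sort -rh | head -n 3"}}\n'
    "</tool_call>"
)
print(extract_tool_calls(sample))
```

Malformed blocks are skipped here rather than raised, which keeps a long dialogue alive; a stricter agent might instead report the parse error back to the model.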
Usage
python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "valiere/ISETrace-SFT-8B"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {"role": "user", "content": "List the largest 3 files under /var/log and tell me their sizes."},
]
# Render the chat template and tokenize the prompt.
inputs = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
# do_sample=True makes the temperature and top_p settings take effect explicitly.
out = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.8)
# Decode only the newly generated tokens, skipping the prompt.
print(tok.decode(out[0][inputs.shape[1]:], skip_special_tokens=True))
For tool use, pass your tool schemas via tools= in apply_chat_template; the model produces OpenAI-style tool calls. Serve with vLLM or SGLang for production throughput.
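As a rough illustration of passing tool schemas, the sketch below defines one tool in the JSON-schema "function" format and renders the prompt as text for inspection. The `run_shell` tool and the `render_prompt` helper are hypothetical; `tok` is assumed to be the tokenizer loaded in the usage example above.

```python
# Hypothetical tool definition in the JSON-schema "function" format.
RUN_SHELL_TOOL = {
    "type": "function",
    "function": {
        "name": "run_shell",
        "description": "Run a shell command and return its stdout and stderr.",
        "parameters": {
            "type": "object",
            "properties": {
                "command": {"type": "string", "description": "Command line to execute."},
            },
            "required": ["command"],
        },
    },
}


def render_prompt(tok, messages, tools=(RUN_SHELL_TOOL,)):
    """Render the chat prompt as a string with the tool schemas included."""
    return tok.apply_chat_template(
        messages,
        tools=list(tools),
        add_generation_prompt=True,
        tokenize=False,  # return text instead of token ids so it can be inspected
    )
```

Printing the rendered string is a quick way to confirm that the tool definitions reach the prompt before wiring up generation.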
Intended use & limitations
ISETrace-SFT-8B targets macOS/Linux OS-terminal agent tasks — shell execution, file operations, and multi-step tool-use under a user simulator. It does not cover Windows, GUI-based interaction, or browser automation. As a research checkpoint it inherits the biases and knowledge cutoff of Qwen3-8B and the distribution of the ISETrace corpus.
Tool calls executed by an agent built on this model run real commands; sandbox appropriately before granting filesystem or network access.
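One possible way to sandbox is to run each command inside a throwaway container with no network and a read-only root filesystem. The sketch below only builds the `docker run` argument list; it assumes Docker is available, and the image name, resource limits, and helper name are illustrative choices rather than recommendations from the model authors.

```python
import shlex


def sandboxed_argv(command: str, image: str = "debian:stable-slim", workdir: str = "/work") -> list[str]:
    """Build a `docker run` argv that executes `command` in a locked-down container."""
    return [
        "docker", "run", "--rm",
        "--network", "none",                    # no network access
        "--read-only",                          # read-only root filesystem
        "--tmpfs", f"{workdir}:rw,size=64m",    # small writable scratch directory
        "--workdir", workdir,
        "--cap-drop", "ALL",                    # drop all Linux capabilities
        "--security-opt", "no-new-privileges",
        "--pids-limit", "128",
        "--memory", "512m",
        image,
        "sh", "-c", command,
    ]


argv = sandboxed_argv("ls -lS /var/log | head -n 3")
print(shlex.join(argv))
```

The resulting list can be handed to `subprocess.run(argv, capture_output=True, text=True, timeout=...)`, with the captured output returned to the model as the tool response.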
License & citation
This model is a derivative of Qwen3-8B and is released under the Apache 2.0 license, consistent with the base model. The ISETrace training data is released separately under CC BY 4.0.
bibtex
@misc{isetrace2026,
  title = {From Intent to Trajectory: Execution-Grounded Multi-Turn Data Synthesis for OS Agents},
  author = {Valiere01},
  year = {2026},
  howpublished = {\url{https://github.com/Valiere01/ISE-Trace}},
  note = {Paper link forthcoming}
}
Model provider: valiere
Model tree: base Qwen/Qwen3-8B; fine-tuned: this model
Modalities: text input, text output