Tesslate
OmniCoder-9B
Coding agent model, fine-tuned on 425K+ real-world agentic coding trajectories to deliver strong error recovery, tool use, and multi-step reasoning capabilities.
Overview
OmniCoder-9B is a 9-billion parameter coding agent model built by Tesslate, fine-tuned on top of Qwen3.5-9B's hybrid architecture (Gated Delta Networks interleaved with standard attention). It was trained on 425,000+ curated agentic coding trajectories spanning real-world software engineering tasks, tool use, terminal operations, and multi-step reasoning.
The training data was specifically built from Claude Opus 4.6 agentic and coding reasoning traces, targeting scaffolding patterns from Claude Code, OpenCode, Codex, and Droid. The dataset includes successful trajectories from models like Claude Opus 4.6, GPT-5.4, GPT-5.3-Codex, and Gemini 3.1 Pro.
The model shows strong agentic behavior: it recovers from errors (read-before-write), responds to LSP diagnostics, and uses proper edit diffs instead of full rewrites. These patterns were learned directly from the real-world agent trajectories it was trained on.
Key Features
- Trained on Frontier Agent Traces : Built from Claude Opus 4.6, GPT-5.3-Codex, GPT-5.4, and Gemini 3.1 Pro agentic coding trajectories across Claude Code, OpenCode, Codex, and Droid scaffolding
- Hybrid Architecture : Inherits Qwen3.5's Gated Delta Networks interleaved with standard attention for efficient long-context processing
- 262K Native Context : Full 262,144 token context window, extensible to 1M+
- Error Recovery : Learns read-before-write patterns, responds to LSP diagnostics, and applies minimal edit diffs instead of full rewrites
- Thinking Mode : Supports
<think>...</think>reasoning chains for complex problem decomposition - Apache 2.0 : Fully open weights, no restrictions
Architecture
OmniCoder inherits Qwen3.5-9B's hybrid architecture:
- Gated Delta Networks : Linear attention layers interleaved with standard attention for efficient long-range dependencies
- VLM Backbone : Built on Qwen3_5ForConditionalGeneration
Limitations
- Performance on non-English tasks has not been extensively evaluated
- Tool-calling format is flexible but works best with the scaffolding patterns seen in training
Run this model inference on single tenant GPU with unmatched speed and reliability at scale.
Get help setting up a custom Dedicated Endpoints.
Talk with our engineer to get a quote for reserved GPU instances with discounts.
Model provider
Tesslate
Model tree
Base
Qwen/Qwen3.5-9B
Fine-tuned
this model
Modalities
Input
Video, Text, Image
Output
Text
Pricing
Dedicated Endpoints
View detailsSupported Functionality
Model APIs
Dedicated Endpoints
Container
More information