groxaxo
Qwento-Agentic
License: apache-2.0
What this is
- Type: test run — a single short curriculum stage (2K sequence length), early checkpoint.
- Method: QLoRA (rank 16, α 32) applied to the model's sequence-mixing path (full-attention q/k/v/o + linear-attention input/output projections across all 40 layers), then merged into the BF16 base weights. The 256 MoE experts were left frozen.
- Format: BF16 safetensors, drop-in with 🤗 Transformers / vLLM (same architecture and tokenizer as the base).
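To make the "merged into the base weights" step concrete, here is a minimal NumPy sketch of the standard LoRA merge, W' = W + (α / r) · B · A, using the rank and α stated above. The matrix sizes and random values are hypothetical toy numbers, and the sketch works in float64 rather than BF16; it illustrates the general technique, not the exact script used for this checkpoint.

```python
import numpy as np

RANK, ALPHA = 16, 32      # values stated in this card
SCALE = ALPHA / RANK      # standard LoRA scaling factor, here 2.0


def merge_lora(w, a, b, scale=SCALE):
    """Fold a LoRA update into a dense weight: W + scale * (B @ A)."""
    return w + scale * (b @ a)


# Hypothetical toy shapes; real projection matrices are far larger.
rng = np.random.default_rng(0)
d_out, d_in = 64, 48
w = rng.standard_normal((d_out, d_in))        # frozen base weight
a = rng.standard_normal((RANK, d_in)) * 0.01  # LoRA "A" (down-projection)
b = rng.standard_normal((d_out, RANK)) * 0.01 # LoRA "B" (up-projection)
x = rng.standard_normal((5, d_in))            # a small batch of activations

# Output with the adapter kept separate from the base weight.
adapter_out = x @ w.T + SCALE * (x @ a.T) @ b.T
# Output after merging the adapter into a single dense matrix.
merged_out = x @ merge_lora(w, a, b).T

assert np.allclose(adapter_out, merged_out)
```

Because the merged matrix has the same shape as the original, the resulting checkpoint needs no adapter-aware loading code, which is consistent with the drop-in format described above.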
Training data (curated, publicly available)
A token-balanced blend of cleaned public datasets:
| Source | Focus |
|---|---|
| Jackrong/Claude-opus-4.7-TraceInversion-5000x | reasoning / trace-inversion problem solving |
| lordx64/reasoning-distill-claude-opus-4-7-max | high-quality reasoning traces |
| lordx64/reasoning-distill-opus-4-7-max-sft | instruction-style reasoning SFT |
| Infatoshi/kernelbench-mega-traces | GPU-kernel coding traces |
| Glint-Research/fable-5-traces | multi-turn agentic coding (tool use) |
All sources were structurally cleaned and quality-filtered before mixing.
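The card does not say how the blend was balanced. One plausible reading of "token-balanced" is that each source contributes roughly the same number of tokens, with smaller sources capped at what they have. The sketch below illustrates that reading only; the function name, source names, and token counts are hypothetical and not taken from the actual pipeline.

```python
def balanced_token_budget(available, total_budget):
    """Split total_budget as evenly as possible across sources.

    A source never contributes more tokens than it has; any shortfall is
    redistributed among the sources that still have tokens to spare.
    """
    budget = {name: 0 for name in available}
    remaining = total_budget
    open_sources = set(available)
    while remaining > 0 and open_sources:
        share = remaining // len(open_sources)
        if share == 0:
            break  # fewer tokens left than open sources; stop here
        for name in sorted(open_sources):
            take = min(share, available[name] - budget[name])
            budget[name] += take
            remaining -= take
        # Drop sources that have been used up.
        open_sources = {n for n in open_sources if budget[n] < available[n]}
    return budget


# Hypothetical token counts, for illustration only.
available_tokens = {"small_source": 100, "medium_source": 1000, "large_source": 1000}
plan = balanced_token_budget(available_tokens, total_budget=900)
```

In this toy case the small source is used in full and the other two split the rest evenly, so no single large dataset dominates the mix.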
Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

m = AutoModelForCausalLM.from_pretrained("groxaxo/Qwento-Agentic", torch_dtype="bfloat16", device_map="auto", trust_remote_code=True)
tok = AutoTokenizer.from_pretrained("groxaxo/Qwento-Agentic")
```
Limitations
This is a preliminary test checkpoint from a short training run; it has not been
benchmarked and should be treated as experimental. It inherits the license and any usage
restrictions of the base model (Qwen/Qwen-AgentWorld-35B-A3B).
Model provider: groxaxo
Base model: Qwen/Qwen-AgentWorld-35B-A3B (this model is a fine-tune)
Modalities: Video, Text, Image input; Text output