Run this model inference on single tenant GPU with unmatched speed and reliability at scale.
Run this model inference with full control and performance in your environment.
Get help setting up a custom Dedicated Endpoints.
Talk with our engineer to get a quote for reserved GPU instances with discounts.
README
Model Summary
SWE-Qwen3-14B is a LoRA fine-tuned SWE agent model based on Qwen3-14B, trained on 20K filtered trajectories from SWE-Star collected under a modified OpenHands scaffold.
🔗 Data: SWE-Openhands-Devstral-32k-20K
Training Configuration
| Item | Value |
|---|---|
| Base Model | Qwen3-14B |
| PEFT | LoRA (rank=32, alpha=64, dropout=0.1) |
| Target Modules | All linear layers (q/k/v/o/up/down/gate_proj) |
| Training Data | 20K filtered SWE-Star trajectories |
| Max Context | 32,768 tokens |
| Epochs | 2 (1,250 steps) |
| Batch Size | 16 (micro=1/GPU, grad_accum=8) |
| Learning Rate | 1e-4, cosine, warmup 5% |
| History Truncation | keep_fraction=0.5 |
| Hardware | 2× GPU (FSDP2) |
Context length and history truncation. We use a maximum context length of 32,768 tokens. Since many agent trajectories exceed this limit, we enable history truncation with a keep fraction of 0.5: when a trajectory exceeds the window, the oldest turns are dropped while preserving the most recent 50%. This ensures the model always sees the most relevant context (recent edits, test results, error messages) rather than losing the tail end, which typically contains the final fix and submission.
Usage
python
from peft import PeftModelfrom transformers import AutoModelForCausalLM, AutoTokenizerbase_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-14B", torch_dtype="auto", device_map="auto")tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-14B")model = PeftModel.from_pretrained(base_model, "ubicloud/SWE-Qwen3-14B")
Model provider
ubicloud
Model tree
Base
Qwen/Qwen3-14B
Adapter
this model
Modalities
Input
Text
Output
Text
Pricing
Dedicated Endpoints
View detailsSupported Functionality
Model APIs
Dedicated Endpoints
Container
More information