DJLougen
Qwable-5-27B-Coder
License: apache-2.0
Release channels
| Repo | Format | Use it when |
|---|---|---|
| DJLougen/Qwable-5-27B-Coder | BF16 Transformers safetensors | You want the source checkpoint, further training, conversion, or quality-ceiling evaluation. |
| DJLougen/Qwable-5-27B-Coder-GGUF | GGUF | You want llama.cpp, Ollama, or local workstation inference. |
| DJLougen/Qwable-5-27B-Coder-NVFP4 | ModelOpt NVFP4 safetensors | You want a compact NVIDIA-serving checkpoint for supported vLLM / TensorRT-LLM stacks. |
Trace stack
```text
unsloth/Qwen3.6-27B
  -> Claude Fable 5 coder-agent traces
  -> Kimi 2.7 Coder traces
  -> Qwable-5-27B-Coder
```
The release is aimed at agentic coding behavior, not benchmark-demo prose. The training signal is trace-shaped: inspect, decide, edit, verify, recover.
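To make the inspect, decide, edit, verify, recover shape concrete, here is a minimal sketch of a driver loop that a harness might wrap around the model. It is not the maintainer's training or serving harness. `agent_loop`, `run_verifier`, and `fake_edit` are hypothetical names, and the model call is replaced by a stub so the example runs on its own.

```python
import os
import subprocess
import sys
import tempfile
from typing import Callable, List, Tuple


def run_verifier(command: List[str]) -> Tuple[bool, str]:
    """Run a verifier command and return (passed, combined stdout + stderr)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr


def agent_loop(
    propose_and_apply: Callable[[str], None],
    verifier: List[str],
    max_attempts: int = 3,
) -> bool:
    """Edit, verify, and feed failures back until the verifier passes."""
    feedback = ""
    for _ in range(max_attempts):
        propose_and_apply(feedback)              # inspect + decide + edit
        passed, output = run_verifier(verifier)  # verify
        if passed:
            return True
        feedback = output                        # recover: reuse the failure
    return False


# Stub "model": the first edit does nothing, the second one fixes the problem.
with tempfile.TemporaryDirectory() as tmp:
    marker = os.path.join(tmp, "fixed.txt")
    calls: List[str] = []

    def fake_edit(feedback: str) -> None:
        calls.append(feedback)
        if len(calls) == 2:
            open(marker, "w").close()

    check = [
        sys.executable,
        "-c",
        f"import os, sys; sys.exit(0 if os.path.exists({marker!r}) else 'marker missing')",
    ]
    ok = agent_loop(fake_edit, check)

print(ok, len(calls))
```

In a real setup the stub would be replaced by a call that sends the verifier output to the model and applies the patch it returns.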
| Attribute | Details |
|---|---|
| Base | unsloth/Qwen3.6-27B |
| Architecture tag | qwen3_5 |
| Release format | Transformers + safetensors |
| Approx. weight size | 55.6 GB across 15 safetensors shards |
| Precision | BF16 checkpoint metadata |
| Pipeline | image-text-to-text |
| Context metadata | 262,144 tokens |
| Primary use | coding agents, repository work, terminal workflows, tool-use-style chat |
| License | Apache-2.0 |
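The weight size in the table is consistent with a BF16 checkpoint at 2 bytes per parameter. The sketch below works through that arithmetic for hardware sizing. It assumes the 55.6 GB figure is decimal gigabytes and covers every weight in the repository, including any vision encoder. The lower-bit numbers are idealized: they ignore quantization scales, activations, and KV cache. The function names are illustrative.

```python
# Back-of-envelope weight sizing from the checkpoint size in the table.

CHECKPOINT_GB = 55.6        # from the attribute table (decimal GB assumed)
BYTES_PER_PARAM_BF16 = 2.0  # BF16 stores each parameter in 16 bits


def estimate_params_billions(checkpoint_gb: float, bytes_per_param: float) -> float:
    """Infer the parameter count, in billions, from the checkpoint size."""
    return checkpoint_gb / bytes_per_param


def estimate_weight_gb(params_billions: float, bits_per_param: float) -> float:
    """Estimate raw weight storage in GB at a given bit width."""
    return params_billions * bits_per_param / 8.0


params_b = estimate_params_billions(CHECKPOINT_GB, BYTES_PER_PARAM_BF16)
print(f"~{params_b:.1f}B parameters")
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{estimate_weight_gb(params_b, bits):.1f} GB")
```

Real quantized files, such as the GGUF or NVFP4 releases, will be somewhat larger than the idealized 4-bit estimate because of per-block metadata and any layers kept at higher precision.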
What Qwable is tuned to do
- Navigate real repositories instead of isolated snippets.
- Translate failing command output into the next useful patch.
- Keep constraints alive across multi-step coding tasks.
- Produce tool-friendly, implementation-oriented answers.
- Handle long engineering prompts with logs, diffs, stack traces, and partial failures.
- Bias toward concrete edits, commands, and verification over generic advice.
Quickstart
Install a recent Transformers build that supports the Qwen3.6 / Qwen3-VL model family. The checkpoint is large; use device_map="auto" or an equivalent sharded serving setup.
```bash
pip install -U transformers accelerate safetensors pillow
```
Text-only coding prompt:
```python
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "DJLougen/Qwable-5-27B-Coder"

# Load the processor (tokenizer + chat template) and shard the model across available devices.
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Qwable, a precise coding agent. Inspect first, patch carefully, verify behavior."},
    {"role": "user", "content": "Write a Python function that merges overlapping intervals, then explain the edge cases."},
]

# Render the chat template to text, then tokenize it as a batch of one.
prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[prompt], return_tensors="pt").to(model.device)

with torch.inference_mode():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=1024,
        do_sample=True,
        temperature=1.0,
        top_p=0.95,
        top_k=20,
    )

# Drop the prompt tokens so only the newly generated text is decoded.
new_tokens = output_ids[:, inputs.input_ids.shape[-1]:]
print(processor.batch_decode(new_tokens, skip_special_tokens=True)[0])
```
Download locally:
```bash
hf download DJLougen/Qwable-5-27B-Coder --local-dir Qwable-5-27B-Coder
```
Prompting profile
Qwable works best when the prompt looks like an actual coding task, not a riddle.
Good inputs include the relevant files, exact failing command output, hard constraints, expected output format, tool boundaries, and a verifier command or acceptance test when available.
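One way to keep those ingredients together is a small prompt builder. The sketch below is only an illustration: `CodingTask`, `build_task_prompt`, the section titles, and the example file and test names are all hypothetical, and nothing here is a format the model requires.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CodingTask:
    """Hypothetical container for the prompt ingredients listed above."""
    goal: str
    files: Dict[str, str] = field(default_factory=dict)  # path -> contents
    failing_output: str = ""
    constraints: List[str] = field(default_factory=list)
    output_format: str = ""
    verifier_command: str = ""


def build_task_prompt(task: CodingTask) -> str:
    """Render the task as plain-text sections; empty sections are skipped."""
    sections = [f"## Task\n{task.goal}"]
    for path, contents in task.files.items():
        sections.append(f"## File: {path}\n{contents}")
    if task.failing_output:
        sections.append(f"## Failing command output\n{task.failing_output}")
    if task.constraints:
        bullets = "\n".join(f"- {c}" for c in task.constraints)
        sections.append(f"## Hard constraints\n{bullets}")
    if task.output_format:
        sections.append(f"## Expected output format\n{task.output_format}")
    if task.verifier_command:
        sections.append(f"## Verify with\n{task.verifier_command}")
    return "\n\n".join(sections)


task = CodingTask(
    goal="Fix the failing test in tests/test_intervals.py.",
    files={"intervals.py": "def merge(intervals):\n    return intervals\n"},
    failing_output="AssertionError: [[1, 3], [2, 4]] != [[1, 4]]",
    constraints=["Do not change the public function signature."],
    output_format="A unified diff only.",
    verifier_command="pytest tests/test_intervals.py -q",
)
user_prompt = build_task_prompt(task)
print(user_prompt)
```

The resulting string can be passed as the user message content in the Quickstart example.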
Suggested system prompt:
```text
You are Qwable, a precise coding agent. Inspect before editing. Prefer minimal, correct patches. Preserve existing conventions. Verify behavior with the narrowest meaningful test before finalizing.
```
For benchmark runs, keep prompts, sampling, max tokens, and tool schema exposure identical between the base model and Qwable. The current generation_config.json uses temperature=1.0, top_p=0.95, and top_k=20.
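A simple way to enforce identical settings is to define them once and reuse the same frozen object for both runs. This is a sketch under assumptions: `RunSettings` and `SHARED` are hypothetical names, the system prompt and the `max_new_tokens` value are placeholders, and only the three sampling values come from this card.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RunSettings:
    """Settings that should be identical across the models being compared."""
    system_prompt: str
    max_new_tokens: int
    temperature: float = 1.0  # values reported for generation_config.json
    top_p: float = 0.95
    top_k: int = 20
    do_sample: bool = True

    def generate_kwargs(self) -> dict:
        """Sampling arguments in the shape passed to model.generate()."""
        kwargs = asdict(self)
        kwargs.pop("system_prompt")  # the prompt goes into messages instead
        return kwargs


SHARED = RunSettings(
    system_prompt="You are a precise coding agent.",  # placeholder text
    max_new_tokens=1024,                              # placeholder budget
)

# Both runs reference the same frozen object, so only the model id differs.
runs = {
    model_id: {"model_id": model_id, "settings": SHARED}
    for model_id in ("unsloth/Qwen3.6-27B", "DJLougen/Qwable-5-27B-Coder")
}
print(SHARED.generate_kwargs())
```

Because the dataclass is frozen, an accidental per-model override raises an error instead of silently skewing the comparison.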
Evaluation status
Current public status: early maintainer testing only. The maintainer has observed wins over the base model on a private coder benchmark, but reproducible claims require the full packet: benchmark name, split, prompt format, tool schema, harness commit, sampling settings, pass/fail rules, and raw results.
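The packet can be treated as a checklist. The sketch below is one possible representation, not a published schema: `EvalPacket` and its field names are hypothetical, chosen to mirror the items listed above, and the benchmark name in the example is a placeholder.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class EvalPacket:
    """Hypothetical record of what a reproducible claim should ship with."""
    benchmark_name: Optional[str] = None
    split: Optional[str] = None
    prompt_format: Optional[str] = None
    tool_schema: Optional[str] = None
    harness_commit: Optional[str] = None
    sampling_settings: Optional[dict] = None
    pass_fail_rules: Optional[str] = None
    raw_results_path: Optional[str] = None

    def missing_fields(self) -> List[str]:
        """Names of fields that are still empty, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def is_complete(self) -> bool:
        return not self.missing_fields()


packet = EvalPacket(
    benchmark_name="example-private-benchmark",  # placeholder, not a real name
    sampling_settings={"temperature": 1.0, "top_p": 0.95, "top_k": 20},
)
print(packet.is_complete())
print(packet.missing_fields())
```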
Vision and multimodal note
The repository is configured as image-text-to-text, and the base model family supports image/video tokens through the Qwen vision stack. This fine-tune is marketed for coding behavior. Do not assume it improves vision understanding unless you evaluate that separately.
Limitations
- Public benchmark scores are not published yet.
- The model may inherit failure modes from the base model and from the trace sources.
- Long-context behavior depends on runtime implementation, hardware, KV cache settings, and prompt structure.
- Tool-use quality depends on prompt format and schema consistency.
- This is a large BF16 checkpoint; most local users will need quantization or multi-GPU serving.
- The card does not claim safety alignment beyond the base model and fine-tuning data.
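To see why long-context behavior depends so heavily on KV cache settings, a rough cache-size estimate helps. The sketch below uses the standard formula for full-attention layers. The layer count, KV head count, and head dimension are hypothetical placeholders, not values read from this model's config.json, and architectures that mix in linear-attention or sliding-window layers will cache less than this suggests.

```python
def kv_cache_bytes(
    num_attention_layers: int,
    num_kv_heads: int,
    head_dim: int,
    seq_len: int,
    batch_size: int = 1,
    bytes_per_element: int = 2,  # 2 for BF16/FP16 cache, 1 for an 8-bit cache
) -> int:
    """Keys plus values for every cached token across full-attention layers."""
    return (
        2  # one tensor for keys, one for values
        * num_attention_layers
        * num_kv_heads
        * head_dim
        * seq_len
        * batch_size
        * bytes_per_element
    )


# HYPOTHETICAL shape values for illustration only; read the real ones from config.json.
HYPOTHETICAL = dict(num_attention_layers=64, num_kv_heads=8, head_dim=128)

for tokens in (8_192, 65_536, 262_144):
    gib = kv_cache_bytes(seq_len=tokens, **HYPOTHETICAL) / 2**30
    print(f"{tokens:>7} tokens: ~{gib:.0f} GiB of KV cache per sequence")
```

With these placeholder numbers, the cache at the full 262,144-token context is larger than the BF16 weights themselves, which is why cache precision and maximum sequence length are usually the first serving knobs to tune.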
License
Released under Apache-2.0, following the upstream base model license metadata.
Model provider
DJLougen
Model tree
- Base: unsloth/Qwen3.6-27B
- Fine-tuned: this model
Modalities
- Input: video, text, image
- Output: text