Run this model inference on single tenant GPU with unmatched speed and reliability at scale.
Run this model inference with full control and performance in your environment.
Get help setting up a custom Dedicated Endpoints.
Talk with our engineer to get a quote for reserved GPU instances with discounts.
README
License: apache-2.0What this model does
Given a concurrent Go program and a partial execution trace (goroutine scheduler events), predict the next scheduler event:
markdown
Input: Go program source + partial trace (GoStart, GoBlock, GoUnblock, GoCreate, GoEnd, GoSched events)Output: {"event_type": "GoBlock", "goroutine_id": 3, "reasoning": "...", "confidence": "high"}
Training
| Setting | Value |
|---|---|
| Base model | Qwen/Qwen2.5-Coder-1.5B-Instruct |
| Method | QLoRA (4-bit NF4 + LoRA r=8) |
| Dataset | kavirubc/weave-bench — 1,377 train / 366 val |
| Epochs | 3 |
| Hardware | NVIDIA A40 48GB |
| train_loss | 0.094 |
| eval_loss | 0.326 |
Results (Phase 12)
| Model | Accuracy | Notes |
|---|---|---|
| Qwen2.5-Coder-1.5B fine-tuned (this model) | 40.2% | Phase 12 |
| Qwen2.5-Coder-1.5B zero-shot | TBD | not yet measured |
| Gemini zero-shot (Phase 4 baseline) | 56.0% | different, much larger model |
Note: The 56% Gemini baseline is not a direct comparison — it used a much larger model with no fine-tuning. The relevant comparison is Qwen2.5-Coder-1.5B zero-shot vs fine-tuned, which has not yet been measured. This is planned as the next evaluation step. Investigation into the 40.2% result is ongoing — likely cause is metadata fields (
concurrency_pattern,nondeterminism) missing from the JSONL.
This is an experimental research checkpoint published for reproducibility. See the GitHub repo and dataset for full context.
Usage
python
from transformers import AutoModelForCausalLM, AutoTokenizerfrom peft import PeftModelimport torchbase = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-1.5B-Instruct",torch_dtype=torch.bfloat16,device_map="auto",)model = PeftModel.from_pretrained(base, "kavirubc/weave-ccwm-qwen2.5-coder-1.5b-lora")tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-1.5B-Instruct")
Or use the eval script from the repo:
bash
uv run python scripts/run_eval.py \--adapter kavirubc/weave-ccwm-qwen2.5-coder-1.5b-lora \--val_file dataset/output/kaggle_upload/val_point_dups.jsonl
Citation
bibtex
@misc{weave2026,author = {Hapuarachchi, Kaviru},title = {Weave: Concurrent Code World Models},year = {2026},url = {https://github.com/kaviru2/Weave}}
Model provider
kavirubc
Model tree
Base
Qwen/Qwen2.5-Coder-1.5B-Instruct
Adapter
this model
Modalities
Input
Text
Output
Text
Pricing
Dedicated Endpoints
View detailsSupported Functionality
Model APIs
Dedicated Endpoints
Container
More information