AlexWortega

AlexWortega

qwen35-4b-clawd-rift

Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

README

License: apache-2.0

Usage

python

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
base = AutoModelForCausalLM.from_pretrained('Qwen/Qwen3.5-4B', dtype=torch.bfloat16, device_map='cuda')
model = PeftModel.from_pretrained(base, 'AlexWortega/qwen35-4b-clawd-rift')
tok = AutoTokenizer.from_pretrained('AlexWortega/qwen35-4b-clawd-rift')

Or with sglang:

bash

python -m sglang.launch_server --model-path Qwen/Qwen3.5-4B \
--lora-paths clawd-rift=AlexWortega/qwen35-4b-clawd-rift \
--tool-call-parser hermes

Evaluation results

tbench-2 (89 docker tasks via Pi-style runner)

7/89 (7.9%). Tasks unique to clawd-rift: fix-ocaml-gc, pytorch-model-recovery.

Table
Variant in pipelinePass on tbench-2
ckpt600 (Soyuz SFT only)7
clawd-100 (+ ClawGym 100 steps)7
clawd-200 (+ ClawGym 200 steps)7
clawd-rft (positive-only SFT on rollouts)6
clawd-rift (true RIFT on rollouts) — this model7

ClawGym-Bench (200 tasks via openclaw scaffold)

Table
StatValue
mean0.371
half+ (≥0.5)80/200 (40%)
perfect (=1.0)2 (tasks 78, 148)
zero40

Comparison to RUC-AIBOX ClawGym leaderboard (compact open-weight models):

Table
ModelClawGym avg
Qwen3-32B33.11
Qwen3-8B35.02
clawd-rift (this, 4B, QLoRA, 1 GPU)37.10
Qwen3-30A3B (MoE)45.11
ClawGym-4B (RUC-AIBOX full SFT)47.73

Optimal inference parameters

Sampling sweet-spot is scaffold-dependent.

Table
ScaffoldTask typeOptimal sampling
openclaw (ClawGym-style formal spec)JSON/Markdown to schemaT=0.3-0.5, top_p=0.95, no min_p
pi-agent (terminus_runner shell explore)trial-and-error commandsT=0.7-0.8, top_p=0.95, min_p=0.05

Universal default that loses only ~5% on each:

markdown

temperature=0.5, top_p=0.95, top_k=40, repetition_penalty=1.05

Training methodology — pipeline of 3 stages

Stage 1: Soyuz SFT (ckpt600 — base agent format)

QLoRA r=64 alpha=128 on Qwen/Qwen3.5-4B.

  • Datasets: AlexWortega/Soyuz-sft + AlexWortega/AgentTrove
  • Format: Hermes-style JSON tool calls (<tool_call>{"name":...,"arguments":...}</tool_call>)
  • 600 steps total, seq=8K, Muon optimizer for LoRA matrices
  • Output: ckpt-400, ckpt-600 (intermediate); soup_sum = ckpt400 + ckpt600 (arithmetic merge)

Stage 2: ClawGym continue-train (clawd-100, clawd-200 — openclaw scaffold adaptation)

Continue-train ckpt600 on filtered RUC-AIBOX/ClawGym-Trajectory.

  • 1937 trajectories (filtered ≤16K tokens out of 24.5K)
  • 200 steps, seq=16K, LR=1e-4, AdamW
  • Hermes chat template + openclaw native tools (read/write/exec/web_search/...)
  • Output: clawd-100 (mid), clawd-200 (final)

Stage 3: RIFT — own rollouts + reward feedback

True RIFT loss on top of clawd-200:

python

# positive (reward > 0): NLL × reward — weighted SFT
# negative (reward = 0): exp(logp) × negative_scale — unlikelihood

Repos

Table
AssetLinkSize
LoRA adapterqwen35-4b-clawd-rift340 MB
Merged bf16qwen35-4b-clawd-rift-merged8.4 GB
GGUF (4 quants)qwen35-4b-clawd-rift-gguf18.6 GB
Raw evalsqwen35-4b-clawd-rift-evals<1 MB

GGUF breakdown:

  • clawd-rift-f16.gguf (7.9 GB, baseline)
  • clawd-rift-Q8_0.gguf (4.2 GB, near-lossless)
  • clawd-rift-Q5_K_M.gguf (2.9 GB, recommended)
  • clawd-rift-Q4_K_M.gguf (2.6 GB, smallest)

W&B training logs: https://wandb.ai/alexwortega/vae-llm-agents


A cleaner, stronger reference for the Stage-1 base (Soyuz SFT only — no ClawGym, no RIFT) is now available, trained as full bf16 LoRA r=128 (vs QLoRA r=64 here):

Final eval on Soyuz-clean held-out: loss=0.247, token_acc=0.936. Trained on the cleaned 11-stream subset of AlexWortega/Soyuz-sft at seq=16K, 1 epoch.

Useful if you want only the Hermes-tool-call SFT without the ClawGym/RIFT specialization.

Model provider

AlexWortega

AlexWortega

Model tree

Base

Qwen/Qwen3.5-4B

Adapter

this model

Modalities

Input

Video, Text, Image

Output

Text

Pricing

Dedicated Endpoints

View details

Supported Functionality

Model APIs

Dedicated Endpoints

Container

More information

Explore FriendliAI today