RMDWLLC/kaiju-coder-7-adapter API & Inference Endpoint

Summary

Kaiju Coder 7 by Kiyomi is an RMDW/Kiyomi business-owner coding adapter trained on reviewed, RMDW-owned or RMDW-authored examples. It is designed for practical small-business build work: websites, proposals, intake/CRM flows, Stripe/payment implementation planning, reports, ROI dashboards, automations, operator handbooks, lead generation, sales follow-up, repo patches, and Kiyomi 7.7.7 style AI-company setup packs.

The current release-candidate product path is:

text
Qwen3.6-27B base
-> Kaiju v1.8 LoRA adapter
-> merged full-model artifact for raw local serving
-> Kaiju system prompt
-> deterministic business-owner harnesses
-> verifier/static checks

Do not describe this package as raw weights alone producing every final artifact. The deterministic harness is part of the tested product path.

Base Model

Base model: Qwen/Qwen3.6-27B
Checked upstream revision: 6a9e13bd6fc8f0983b9b99948120bc37f49c13e9
Upstream license metadata: apache-2.0
Upstream license copy: release/upstream/qwen3.6-27b/LICENSE

Attribution wording:

text
Kaiju Coder 7 by Kiyomi is fine-tuned from Qwen under Apache 2.0.

Do not imply endorsement by Qwen, Alibaba, or upstream authors.

Adapter

Adapter path: runs/qwen36-27b-lora-v1.8-business-owner/adapter
Adapter type: LoRA / PEFT
LoRA rank: 16
LoRA alpha: 32
LoRA dropout: 0.02
Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Trainable parameter count: approximately

Merged Local Artifact

Remote merged path: /home/richardecholsai5/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged
Size: 51G
Shards: 14 safetensor shards plus tokenizer/config sidecars
Served model name: kaiju-coder-7
Merge script: scripts/run-gojira-b-qwen36-lora-merge.sh
Serving script: scripts/start-qwen36-merged-sglang.sh

Training

Dataset build: datasets/build/kaiju-sft-v1.7-business-owner-oversampled.jsonl
Reviewed candidate examples: 1,689
SFT rows after controlled business-owner oversampling: 1,881
Train examples: 1,769
Eval examples: 112
Training runtime: 11666.7564s
Training loss: 0.9281658741335074
Max training length: 2048
Training config: training/configs/qwen36-27b-lora-v1.8-business-owner.example.json

Data Provenance

Training data is source-backed and RMDW-owned or RMDW-authored. Client-site repositories are used only as generalized pattern/eval sources unless explicitly reviewed for training eligibility.

Relevant release files:

release/SOURCE_INVENTORY.md
release/source-inventory.json
release/DATA_PROVENANCE_DRAFT.md
datasets/candidates/v1.7-rmdw-business-owner-suite.jsonl

Excluded from training:

Raw secrets, API keys, OAuth tokens, private keys, cookies, and credentials.
Closed-model answers from OpenAI, Anthropic, Gemini, or similar providers as supervised completions unless terms clearly allow it.
Private client data, customer notes, contracts, raw support logs, and client-specific website copy without explicit review and consent.

Evaluation Snapshot

Local product-path evidence:

Unit tests: 65 passing.
Full local RC smoke: passed.
Router hard harness: 23/23.
Router static checks: 23/23.
Business-suite prompts: 2/2.
Local API harness: website and business-suite artifacts pass.

Merged serving evidence:

Current endpoint: http://127.0.0.1:18181/v1, forwarding to vLLM bitsandbytes on Gojira B at http://100.109.109.14:18084/v1
Served model: kaiju-coder-7
Tested context: 16384 for the current OpenCode fast path. Historical SGLang benchmark evidence includes 32768, but 32k should be freshly restarted and re-confirmed before being called the live default.
Probe: 1,155 visible chars in 60.17s.
Proposal rerun: 1/1 paid-ready, 4.0/4.0, 4,014 chars in 212.72s.
Jah credits backend: , chars in .

Known comparison caveat:

Dynamic SGLang LoRA serving is not release evidence for this adapter: adapter-name-only output can be base-equivalent, and corrected selector qwen36-27b:kaiju_v18_business_owner crashes with a fused-module LoRA buffer shape mismatch.
Do not claim raw-weight superiority until broader base-Qwen and GLM/current-production comparisons are complete.

Limitations

Raw full-website generation has not yet passed the merged-model release sweep and should remain harness-first for paid delivery.
The deterministic harness remains the practical paid website workflow.
The adapter needs a strong app layer for file editing, tool use, auth, billing, rate limits, logging, and rollback.
Public HF upload and human review are complete for testing. Real customer paid charging still requires Stripe live-mode setup and controlled live payment verification.
Not intended for high-risk medical, legal, financial, or safety-critical decisions without expert review.

Summary

The current release-candidate product path is:

text
Qwen3.6-27B base
-> Kaiju v1.8 LoRA adapter
-> merged full-model artifact for raw local serving
-> Kaiju system prompt
-> deterministic business-owner harnesses
-> verifier/static checks

Do not describe this package as raw weights alone producing every final artifact. The deterministic harness is part of the tested product path.

Base Model

Base model: Qwen/Qwen3.6-27B
Checked upstream revision: 6a9e13bd6fc8f0983b9b99948120bc37f49c13e9
Upstream license metadata: apache-2.0
Upstream license copy: release/upstream/qwen3.6-27b/LICENSE

Attribution wording:

text
Kaiju Coder 7 by Kiyomi is fine-tuned from Qwen under Apache 2.0.

Do not imply endorsement by Qwen, Alibaba, or upstream authors.

Adapter

Adapter path: runs/qwen36-27b-lora-v1.8-business-owner/adapter
Adapter type: LoRA / PEFT
LoRA rank: 16
LoRA alpha: 32
LoRA dropout: 0.02
Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Trainable parameter count: approximately

Merged Local Artifact

Remote merged path: /home/richardecholsai5/kaiju-coder/models/Kaiju-Coder-Qwen3.6-27B-v1.8-merged
Size: 51G
Shards: 14 safetensor shards plus tokenizer/config sidecars
Served model name: kaiju-coder-7
Merge script: scripts/run-gojira-b-qwen36-lora-merge.sh
Serving script: scripts/start-qwen36-merged-sglang.sh

Training

Dataset build: datasets/build/kaiju-sft-v1.7-business-owner-oversampled.jsonl
Reviewed candidate examples: 1,689
SFT rows after controlled business-owner oversampling: 1,881
Train examples: 1,769
Eval examples: 112
Training runtime: 11666.7564s
Training loss: 0.9281658741335074
Max training length: 2048
Training config: training/configs/qwen36-27b-lora-v1.8-business-owner.example.json

Data Provenance

Training data is source-backed and RMDW-owned or RMDW-authored. Client-site repositories are used only as generalized pattern/eval sources unless explicitly reviewed for training eligibility.

Relevant release files:

release/SOURCE_INVENTORY.md
release/source-inventory.json
release/DATA_PROVENANCE_DRAFT.md
datasets/candidates/v1.7-rmdw-business-owner-suite.jsonl

Excluded from training:

Raw secrets, API keys, OAuth tokens, private keys, cookies, and credentials.
Closed-model answers from OpenAI, Anthropic, Gemini, or similar providers as supervised completions unless terms clearly allow it.
Private client data, customer notes, contracts, raw support logs, and client-specific website copy without explicit review and consent.

Evaluation Snapshot

Local product-path evidence:

Unit tests: 65 passing.
Full local RC smoke: passed.
Router hard harness: 23/23.
Router static checks: 23/23.
Business-suite prompts: 2/2.
Local API harness: website and business-suite artifacts pass.

Merged serving evidence:

Current endpoint: http://127.0.0.1:18181/v1, forwarding to vLLM bitsandbytes on Gojira B at http://100.109.109.14:18084/v1
Served model: kaiju-coder-7
Tested context: 16384 for the current OpenCode fast path. Historical SGLang benchmark evidence includes 32768, but 32k should be freshly restarted and re-confirmed before being called the live default.
Probe: 1,155 visible chars in 60.17s.
Proposal rerun: 1/1 paid-ready, 4.0/4.0, 4,014 chars in 212.72s.
Jah credits backend: , chars in .

Known comparison caveat:

Dynamic SGLang LoRA serving is not release evidence for this adapter: adapter-name-only output can be base-equivalent, and corrected selector qwen36-27b:kaiju_v18_business_owner crashes with a fused-module LoRA buffer shape mismatch.
Do not claim raw-weight superiority until broader base-Qwen and GLM/current-production comparisons are complete.

Limitations

Raw full-website generation has not yet passed the merged-model release sweep and should remain harness-first for paid delivery.
The deterministic harness remains the practical paid website workflow.
The adapter needs a strong app layer for file editing, tool use, auth, billing, rate limits, logging, and rollback.
Public HF upload and human review are complete for testing. Real customer paid charging still requires Stripe live-mode setup and controlled live payment verification.
Not intended for high-risk medical, legal, financial, or safety-critical decisions without expert review.

kaiju-coder-7-adapter

Get help setting up a custom Dedicated Endpoints.

README

Summary

Base Model

Adapter

Merged Local Artifact

Training

Data Provenance

Evaluation Snapshot

Limitations

Explore FriendliAI today

README

Summary

Base Model

Adapter

Merged Local Artifact

Training

Data Provenance

Evaluation Snapshot

Limitations