jianghan-runtime-v5-lora API & Inference Endpoint

Usage

bash
git clone https://github.com/zuiho-kai/jianghan-roleplay-data-pipeline.git
cd jianghan-roleplay-data-pipeline

python scripts/jianghan_runtime_chat.py \
  --model Qwen/Qwen3.5-4B \
  --adapter zuiho-kai/jianghan-runtime-v5-lora \
  --profile role \
  --stage 第三阶段 \
  --prompt "猫灯们把简单账目做成了艺术展，你要处理这件事。" \
  --hide-stage-prefix \
  --retry 2
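
To call the same command from Python, for example in a batch script, the flags above can be assembled into an argument list and passed to `subprocess`. This is only a sketch: the helper names `build_chat_command` and `run_chat` are hypothetical and not part of the repository, and it assumes the script is run from the repository root and writes its reply to standard output.

```python
import subprocess
import sys


def build_chat_command(prompt, stage="第三阶段", profile="role", retry=2):
    """Return the argv list for scripts/jianghan_runtime_chat.py.

    The flags mirror the shell example; passing the prompt as one list
    element avoids any shell quoting issues.
    """
    return [
        sys.executable,
        "scripts/jianghan_runtime_chat.py",
        "--model", "Qwen/Qwen3.5-4B",
        "--adapter", "zuiho-kai/jianghan-runtime-v5-lora",
        "--profile", profile,
        "--stage", stage,
        "--prompt", prompt,
        "--hide-stage-prefix",
        "--retry", str(retry),
    ]


def run_chat(prompt, **kwargs):
    """Run the CLI and return whatever it printed to stdout (assumed to be the reply)."""
    result = subprocess.run(
        build_chat_command(prompt, **kwargs),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


if __name__ == "__main__":
    print(run_chat("猫灯们把简单账目做成了艺术展，你要处理这件事。"))
```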

Current promoted stack:

text
Qwen/Qwen3.5-4B + v5 checkpoint-150 + runtime_rag_v1 + phase-hidden + audit/retry

Quick online deploy

For model + runtime RAG HTTP serving, use the GitHub quick deploy guide:

https://github.com/zuiho-kai/jianghan-roleplay-data-pipeline/blob/main/docs/QUICK_DEPLOY.md

The service exposes GET /health, POST /rag, and POST /chat.
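
A minimal client for these three endpoints might look like the sketch below. Only the paths and HTTP methods come from this page. The base URL, the JSON request encoding, and the payload field names (`prompt`, `stage`, `profile`, borrowed from the CLI flags) are assumptions, so check the quick deploy guide for the actual request schema.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: adjust to wherever the service listens


def build_request(path, payload=None, base_url=BASE_URL):
    """Build a GET request when payload is None, otherwise a JSON POST request."""
    url = base_url.rstrip("/") + path
    if payload is None:
        return urllib.request.Request(url, method="GET")
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json; charset=utf-8"},
    )


def call(path, payload=None, base_url=BASE_URL, timeout=60):
    """Send the request and return (HTTP status, response body as text)."""
    request = build_request(path, payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


if __name__ == "__main__":
    print(call("/health"))
    # Hypothetical payload fields; the real schema may differ.
    question = {"prompt": "猫灯和奥维利亚是什么关系？", "stage": "第三阶段", "profile": "fact"}
    print(call("/rag", question))
    print(call("/chat", {**question, "profile": "role"}))
```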

Runtime RAG files

The adapter does not contain the RAG index. Runtime RAG code and the minimal deployable index live in the GitHub repo:

GitHub runtime repo: https://github.com/zuiho-kai/jianghan-roleplay-data-pipeline
scripts/jianghan_runtime_rag.py
data/world/worldbook/worldbook_knowledge_index_v1.jsonl
data/runtime/jianghan_phase_context_v1.jsonl
data/runtime/jianghan_stage3_runtime_policy_v1.md
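
Because the index ships separately from the adapter, it can help to confirm that the files listed above are present in a checkout before starting the runtime. The following sketch checks each path and counts the records in the `.jsonl` files. The helper names are hypothetical, and it assumes only that each non-empty line of a `.jsonl` file is one JSON value; it makes no assumption about the record fields.

```python
import json
from pathlib import Path

# Paths as listed above, relative to the repository root.
RUNTIME_FILES = [
    "scripts/jianghan_runtime_rag.py",
    "data/world/worldbook/worldbook_knowledge_index_v1.jsonl",
    "data/runtime/jianghan_phase_context_v1.jsonl",
    "data/runtime/jianghan_stage3_runtime_policy_v1.md",
]


def count_jsonl_records(path):
    """Count non-empty lines in a JSONL file, raising ValueError on an unparsable line."""
    count = 0
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_number}: invalid JSON") from exc
            count += 1
    return count


def check_runtime_files(repo_root="."):
    """Return a {relative path: status string} report for the runtime RAG files."""
    report = {}
    for relative in RUNTIME_FILES:
        path = Path(repo_root) / relative
        if not path.is_file():
            report[relative] = "missing"
        elif path.suffix == ".jsonl":
            report[relative] = f"{count_jsonl_records(path)} records"
        else:
            report[relative] = f"{path.stat().st_size} bytes"
    return report


if __name__ == "__main__":
    for name, status in check_runtime_files().items():
        print(f"{name}: {status}")
```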

RAG-only smoke test:

bash
python scripts/jianghan_runtime_rag.py \
  --prompt "猫灯和奥维利亚是什么关系？" \
  --stage 第三阶段 \
  --profile fact
