mikuhhn1239
qwen3-8b-narrative-parsing-lora
README
License: apache-2.0
Task
- Input: numbered narrative units, `[1] "..." [2] "..." ...`
- Output: `{"labels": [{"unit_id": "N", "type": "dialogue|narration|thought|action|scene_description"}]}`
- Test set: 39 examples
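The input format above can be sketched as a small helper. This is only an illustration: the function name `format_units`, the newline separator, and the choice to pass each unit's text through verbatim (without adding quotes) are assumptions, since the card does not show the author's preprocessing code.

```python
def format_units(units, sep="\n"):
    """Number narrative units as `[1] ...`, `[2] ...`, keeping each unit's text verbatim.

    `sep` is an assumption; the card does not state how units are separated.
    """
    return sep.join(f"[{i}] {text}" for i, text in enumerate(units, start=1))


# Hypothetical usage with two units; the first already carries its own quotes.
prompt = format_units(['"你怎么来了?"', "她愣了一下。"])
print(prompt)
```

Numbering starts at 1 so that the `unit_id` values in the output JSON line up with the bracketed indices in the input.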
Five types
| Type | Meaning |
|---|---|
| dialogue | Dialogue |
| narration | Narration |
| thought | Inner thought |
| action | Action |
| scene_description | Scene description |
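Because the output must be JSON restricted to these five labels, a downstream consumer will probably want to validate it. The following is a minimal sketch of such a check; `parse_labels` and `ALLOWED_TYPES` are hypothetical names, and the strictness (rejecting any unknown type) is a design choice, not something the card specifies.

```python
import json

# The five label values listed in the table above.
ALLOWED_TYPES = ("dialogue", "narration", "thought", "action", "scene_description")


def parse_labels(raw):
    """Parse raw model output into {unit_id: type}; raise ValueError on schema violations."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or not isinstance(data.get("labels"), list):
        raise ValueError('expected a JSON object with a "labels" list')
    result = {}
    for item in data["labels"]:
        if not isinstance(item, dict):
            raise ValueError(f"label entry is not an object: {item!r}")
        unit_id, unit_type = str(item.get("unit_id")), item.get("type")
        if unit_type not in ALLOWED_TYPES:
            raise ValueError(f"unit {unit_id}: unknown type {unit_type!r}")
        result[unit_id] = unit_type
    return result


print(parse_labels('{"labels": [{"unit_id": "1", "type": "dialogue"}]}'))
```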
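Example
```markdown
Input:
[1] "你怎么来了?"
[2] 她愣了一下。
[3] 其实我也不知道自己为什么会来。

Output:
{"labels": [
  {"unit_id": "1", "type": "dialogue"},
  {"unit_id": "2", "type": "action"},
  {"unit_id": "3", "type": "thought"}
]}
```
In English, the three units read: [1] "Why are you here?" [2] She froze for a moment. [3] Actually, I don't know why I came either.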
Loading
```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the Stage 1 base model, then attach the LoRA adapter on top of it.
base = AutoModelForCausalLM.from_pretrained(
    "mikuhhn1239/qwen3-8b-novel-base-sft",
    torch_dtype="auto",
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "mikuhhn1239/qwen3-8b-narrative-parsing-lora")
```
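Once the adapter is loaded, inference might look like the sketch below. Several things here are assumptions rather than the author's method: the helper name `classify_units`, greedy decoding, feeding the prompt as raw text (the card does not say whether a chat template is used), and loading the tokenizer from the base repository. The `max_new_tokens=1024` default mirrors the v3.2 setting reported in the card.

```python
def classify_units(model, tokenizer, prompt, max_new_tokens=1024):
    """Greedy-decode a completion for `prompt` and return only the newly generated text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the echoed prompt tokens so only the model's continuation is decoded.
    prompt_len = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True)


# Hypothetical usage (assumes the tokenizer lives in the base repository):
#   from transformers import AutoTokenizer
#   tokenizer = AutoTokenizer.from_pretrained("mikuhhn1239/qwen3-8b-novel-base-sft")
#   raw_json = classify_units(model, tokenizer, '[1] 「你怎么来了?」\n[2] 她愣了一下。')
```

Passing the model and tokenizer as arguments keeps the helper independent of how they were loaded.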
Training
```markdown
Base: Qwen3-8B-Novel-Base-SFT (Stage 1 full-parameter SFT, 72K novel-continuation samples)
Method: LoRA (r=64, α=128, dropout=0.05)
Data: 616 examples (577 train / 39 val / 39 test)
Framework: transformers Trainer + PEFT
Optimizer: AdamW (adamw_torch_fused), cosine schedule, warmup=5%
epoch: 5 | LR: 1e-4 | batch: 1×16 (accum) | bf16 | max_length: 4096
```
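To make the hyperparameters concrete, the sketch below writes them as keyword dictionaries whose keys follow `peft.LoraConfig` and `transformers.TrainingArguments` field names, then derives a few quantities from them. The author's training script is not shown, so this mapping is an assumption; LoRA target modules are not listed in the card and are left out. The step counts assume a single device and round-up behaviour, which may differ from what `Trainer` actually computed.

```python
import math

# Values copied from the card; key names follow peft.LoraConfig fields.
lora_kwargs = {"r": 64, "lora_alpha": 128, "lora_dropout": 0.05}

# Values copied from the card; key names follow transformers.TrainingArguments fields.
training_kwargs = {
    "num_train_epochs": 5,
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 16,
    "bf16": True,
    "optim": "adamw_torch_fused",
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.05,
}
max_length = 4096      # tokenization limit, applied outside TrainingArguments
train_examples = 577   # training split size from the card

# Derived quantities (assuming one device and ceil rounding).
lora_scale = lora_kwargs["lora_alpha"] / lora_kwargs["r"]
effective_batch = (training_kwargs["per_device_train_batch_size"]
                   * training_kwargs["gradient_accumulation_steps"])
steps_per_epoch = math.ceil(train_examples / effective_batch)
total_steps = steps_per_epoch * training_kwargs["num_train_epochs"]
warmup_steps = math.ceil(total_steps * training_kwargs["warmup_ratio"])

print(lora_scale, effective_batch, steps_per_epoch, total_steps, warmup_steps)
```

Under these assumptions the run is short, on the order of a couple of hundred optimizer steps, so the 5% warmup covers only about ten of them.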
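Version history
| Version | Data size | Epochs | LR | JSON parse rate | Type accuracy | Notes |
|---|---|---|---|---|---|---|
| Base (zero-shot) | - | - | - | 0% | 0% | Original Qwen3-8B cannot do the task at all |
| +Stage1 | - | - | - | 0% | 0% | Still cannot do it after reading 669 books |
| v1 | 56 | 3 | 2e-4 | 57.1% | 25.0% | End-to-end (segmentation + classification) |
| v2 | 310 | 3 | 2e-4 | 2.6% | 63.6% | Classification only; quote conflict |
| v3.1 | 310 | 5 | 1e-4 | 2.6% | 63.6% | max_new_tokens=256 truncates the JSON |
| v3.2 ⭐ | 577 | 5 | 1e-4 | 100% | 69.5% | Quote fix + more labeled data + max_new_tokens→1024 |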
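The card does not define how the two metric columns are computed. One reading that is consistent with rows reporting a low parse rate alongside a non-zero type accuracy is that accuracy is measured per unit, over outputs that parsed successfully. The sketch below implements that reading; the function name `evaluate` and the scoring rule are assumptions, not the author's evaluation script.

```python
import json


def evaluate(outputs, gold):
    """Return (json_parse_rate, type_accuracy).

    outputs: list of raw model strings.
    gold:    list of {unit_id: type} dicts, aligned with `outputs`.
    Accuracy counts units only from outputs that parsed (an assumption).
    """
    parsed = correct = total = 0
    for raw, expected in zip(outputs, gold):
        try:
            labels = json.loads(raw)["labels"]
            predicted = {str(x["unit_id"]): x["type"] for x in labels}
        except (ValueError, KeyError, TypeError):
            continue  # truncated or malformed JSON counts as a parse failure
        parsed += 1
        total += len(expected)
        correct += sum(predicted.get(uid) == t for uid, t in expected.items())
    parse_rate = parsed / len(outputs) if outputs else 0.0
    type_accuracy = correct / total if total else 0.0
    return parse_rate, type_accuracy
```

For scale, 1 parsed output out of 39 is about 2.6%, so a 2.6% parse rate would correspond to a single parseable output if it was measured on the 39-example test set.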
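Conclusions
- Base (zero-shot) and +Stage1 both score 0%: without Agent SFT the model cannot do narrative classification ✅
- v3.2 breakthrough: JSON parse rate 2.6% → 100%, type accuracy 63.6% → 69.5% (+5.9 pp)
- Key fixes: input quotes `""` → `「」`, plus max_new_tokens 256 → 1024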
Model provider: mikuhhn1239
Model tree: base mikuhhn1239/qwen3-8b-novel-base-sft; adapter: this model
Modalities: text input, text output