mikuhhn1239

qwen3-8b-novel-base-sft

Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more
Container

Run this model inference with full control and performance in your environment.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

README

License: apache-2.0

用途

All Novel Can Be Galgame 工作台的 Stage1 基座模型。

学习中文小说的叙事风格和角色对话模式,作为下游 LoRA adapter 的基座,执行三类 Agent 任务:

  • Agent 1: mikuhhn1239/qwen3-8b-narrative-parsing-lora — 叙事单元分类
  • Agent 2: mikuhhn1239/qwen3-8b-scene-segmentation-lora — 场景边界检测
  • Agent 3: mikuhhn1239/qwen3-8b-attribution-assist-lora — 角色归因

加载

python

from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained(
"mikuhhn1239/qwen3-8b-novel-base-sft",
torch_dtype="auto",
device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(
"mikuhhn1239/qwen3-8b-novel-base-sft"
)

4-bit 量化(适用于 8GB 显存):

python

from transformers import BitsAndBytesConfig
bnb_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_compute_dtype="float16",
bnb_4bit_quant_type="nf4",
)
model = AutoModelForCausalLM.from_pretrained(
"mikuhhn1239/qwen3-8b-novel-base-sft",
quantization_config=bnb_config,
device_map="auto",
)

训练数据

Table
文件条数说明
continuation.jsonl36,092续写:给前半段→续后半段
instruction.jsonl36,481指令式续写
合计72,573

格式: ChatML [system, user, assistant]

训练详情

调试历程

全参微调 8B 模型内存压力大,经历多轮调试:

Table
#问题原因解决
1单卡 OOM (78G/80G)optimizer+grads+model≈116G/卡上 4 卡 DDP
24 卡 DDP 仍 OOM (79G/80G)DDP 每卡存完整 AdamW 状态(66G)加 DeepSpeed ZeRO-2
3ZeRO-2 backward OOM内存碎片PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
4batch=2 预估 26 小时序列太长seq_len 4096→2048, epochs 3→2, batch 2→4

最终超参

Table
参数
方法全参数 SFT
有效 batch size64 (4 GPU × 4 batch × 4 accumulation)
学习率2e-5
优化器AdamW (adamw_torch_fused)
warmup3%
schedulerlinear decay
精度bf16
最大序列长度2048
epochs2
gradient checkpointingTrue
分布式DeepSpeed ZeRO-2 (4×A800 80GB)

训练结果

  • 耗时: ~9 小时
  • Loss: 3.36 → 2.47
  • 产物: 16GB (4 个 safetensors 分片)

限制

  • 仅支持中文输入
  • 训练数据以网络小说为主,非通用指令模型
  • 无安全对齐,不适用于敏感内容生成

Model provider

mikuhhn1239

Model tree

Base

this model

Modalities

Input

Text

Output

Text

Pricing

Dedicated Endpoints

View details

Supported Functionality

Model APIs

Dedicated Endpoints

Container

More information

Explore FriendliAI today