License: mit
🔮 Roadmap / Preview
This is one of the last QLoRA fine-tunes built on Qwen3.6-27B, with at most one more to follow. What comes next depends on the base-model timeline:
- Primary plan: wait for Qwen3.7-27B. Once Qwen3.7-27B is available, new releases will move to that base. This kicks off a v3 line: more rigorous tuning aimed at near-flawless, stable output (the realistic-crime adapter yuxinlu1/qwen3-6-27b-chinese-crime-fiction-lora-v2 is the reference for that next step), together with upgrading several existing adapters to Qwen3.7.
- Interim fallback. If Qwen3.7-27B does not arrive within the next week or so, a previously finished fine-tune from the backlog will be released in the meantime. Note that this one targets an uncensored base model.
Alongside the model work, the companion writing pipeline (github.com/DuckTraDo/Novel) is getting a major overhaul focused on ease of use. It is expected to ship as a packaged application, opening up features such as automatic continuation / auto-drafting.
A LoRA adapter that guides Qwen3.6-27B toward Chinese xianxia / cultivation prose (玄幻修仙) — a long-form fantasy-cultivation tradition built around immortal-seeking, clan and sect politics, and the slow climb through cultivation realms.
This adapter is designed for a specific writing problem: given a short instruction or a neutral scene description, the model should directly produce Chinese cultivation-fiction prose grounded in the xianxia register, rather than analysis, an outline, a summary, or generic fantasy pastiche.
The core direction is intentionally narrow:
- third-person narration that follows characters' inner judgment as they scheme, cultivate, and size each other up
- cultivation-world settings: secluded caves and closed-door cultivation (洞府/闭关), sects and great clans (宗门/世家), mountain gates, spirit veins, market towns (坊市)
- xianxia imagery grounded in a cultivation cosmology (灵气, 境界, 法器, 丹药, 神识, 御风/驾风, 玉简, 剑意) rather than Western fantasy tropes (no wizards, elves, or Western dragons)
- cultivation jargon woven into the narration itself, not just the dialogue — realm names, technique names, and artifact lore read as native to the voice
- direct fiction prose, not analysis, outlines, or revision advice
- cultivation and power progression treated as a vehicle for depicting clan politics, ambition, and human nature, not as a stat sheet
The goal is for outputs to behave like usable Chinese cultivation-fiction scene drafts, and less like a generic writing assistant explaining what xianxia is.
Typical use cases:
- drafting cultivation/xianxia scenes inside a longer novel project
- rewriting modern-toned paragraphs into cultivation-register prose
- sect / clan / closed-door-breakthrough fiction in a long-form writing pipeline
- local and private creative-writing workflows
- integration with a long-form novel pipeline that manages outline, memory, timeline, and continuity
This is an adapter only. It does not include base model weights, training data, or copyrighted source material.
🧭 About the v1 / v2 Series
This repository is part of a small series of Chinese fiction LoRAs (现代悬疑 realistic crime, 民俗志怪 folk-horror, and now 玄幻修仙 xianxia).
- v1 — Style Retraining. The v1 series focuses mainly on prose style. It uses SFT to move the base model away from generic AI prose toward a specific Chinese literary voice.
- v2 — Expanded Training & Full Release. The v2 line uses a larger, cleaner training set, longer SFT, and ships a complete release (PEFT safetensors + GGUF). Behavioral DPO refinement is applied where available.
Note on this release: this xianxia adapter is the SFT stage (QLoRA, 2 epochs on a chapter-level cultivation corpus). It is labeled v2 for its expanded training and full safetensors release. DPO behavioral refinement has not yet been applied and is planned as a follow-up iteration.
Available formats: HF PEFT safetensors (and a GGUF LoRA once converted). MLX users may also be able to use the PEFT safetensors through mlx-lm depending on their local setup.
🌱 Status
| Field | Value |
|---|---|
| Version | v2 (SFT stage) |
| Focus | Chinese xianxia / cultivation prose behavior |
| Format | HF PEFT safetensors (GGUF LoRA planned) |
| Base model | Qwen3.6-27B |
| Language | Chinese |
| Use case | cultivation scene drafting, sect/clan fiction, realm-progression prose |
| Training style | SFT (QLoRA, r=16 / α=32, 2 epochs, max_seq=3300); DPO planned |
| Recommended workflow | local long-form writing pipeline |
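For readers less familiar with the r / α notation in the table, here is a minimal sketch of what the two numbers imply under the standard LoRA formulation, where the low-rank update is scaled by α / r. It assumes plain LoRA scaling (not a rank-stabilized variant), and the layer shape is hypothetical: the table does not state which modules were targeted or their dimensions.

```python
def lora_scaling(r: int, alpha: int) -> float:
    """Scale applied to the low-rank update B @ A in standard LoRA (alpha / r)."""
    if r <= 0:
        raise ValueError("rank r must be positive")
    return alpha / r


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added to one linear layer: A is (r, d_in), B is (d_out, r)."""
    return r * (d_in + d_out)


if __name__ == "__main__":
    # r=16 and alpha=32 are the values listed in the status table.
    print(lora_scaling(16, 32))
    # 4096 x 4096 is a hypothetical layer shape, used only for illustration.
    print(lora_param_count(4096, 4096, 16))
```

With r=16 and α=32 the update is scaled by a factor of 2, which is a common pairing when α is set to twice the rank.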
🔗 Companion Novel Pipeline
This LoRA is designed to work together with a local-first, long-form Chinese novel writing pipeline: github.com/DuckTraDo/Novel
The pipeline handles the structural side of long-form fiction:
- outline and chapter planning
- scene-level context assembly
- story memory and character tracking
- timeline and continuity checks
This LoRA handles the prose-behavior side:
- third-person cultivation narration with character interiority
- xianxia scene drafting (closed-door breakthrough, sect confrontation, artifact lore)
- cultivation-register diction woven into narration
- avoidance of modern tone and Western fantasy tropes
They are intentionally split: the pipeline owns what happens, the LoRA owns how it reads on the page. You can use either independently, but they are designed as a pair.
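To make the split concrete, here is a rough sketch of how structural context from a pipeline could be folded into a single user message for the adapter. Everything in it is an assumption for illustration: `SceneContext`, `build_messages`, and the Chinese field labels are hypothetical and are not the format used by the companion pipeline.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SceneContext:
    """Hypothetical container for what a pipeline might hand to the LoRA."""

    beat: str  # what happens in this scene (owned by the pipeline)
    memory: list[str] = field(default_factory=list)  # facts to keep consistent
    previous_tail: str = ""  # last lines of the prior scene, if any


def build_messages(ctx: SceneContext) -> list[dict[str, str]]:
    """Fold structural context into one user message; the LoRA supplies the prose."""
    parts = []
    if ctx.memory:
        # "设定" (setting notes) is an assumed label, not a trained prompt format.
        parts.append("设定:" + ";".join(ctx.memory))
    if ctx.previous_tail:
        # "上文" (preceding text) is likewise an assumed label.
        parts.append("上文:" + ctx.previous_tail)
    parts.append("写一段玄幻修仙小说。" + ctx.beat)
    return [{"role": "user", "content": "\n".join(parts)}]
```

The resulting list can be passed to `tokenizer.apply_chat_template` like any other `messages` value. Keeping the beat as the final line mirrors the short-prompt style this adapter was described as handling, while the extra context lines remain optional.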
📦 Files
Files included: adapter_config.json, adapter_model.safetensors, tokenizer.json, tokenizer_config.json, chat_template.jinja, README.md.
A GGUF LoRA (*-f16.gguf) for llama.cpp can be generated from the adapter and will be added to this repo; see the llama.cpp section below.
⚠️ Critical Usage Note: enable_thinking=False
Qwen3.6 ships with a thinking mode enabled by default. When you call apply_chat_template(messages, add_generation_prompt=True) with default settings, the chat template leaves an unclosed <think>\n block before the assistant's turn. With this LoRA loaded, that causes outputs to begin with an English thinking-process preamble (e.g. "Here's a thinking process: 1. Analyze the user input...") instead of Chinese cultivation prose.
The training samples used a closed empty thinking block: <think>\n\n</think>\n\n followed by the actual prose.
Always pass enable_thinking=False:
```python
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)
```
For llama.cpp / GGUF inference, pass the included chat_template.jinja via --chat-template-file, or manually prepend the assistant prefix <think>\n\n</think>\n\n in the prompt.
This is the single most common usage error. If outputs look like English reasoning instead of Chinese prose, this is almost certainly the cause.
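For setups where the prompt string is built by hand (for example when a runtime does not expose `enable_thinking`), the manual prefix described above can be applied with a small string helper. This is a sketch under assumptions: `ensure_closed_think` is a hypothetical name, and it only inspects the tail of the rendered prompt, so it relies on the generation prompt being the last thing in the string.

```python
OPEN_THINK = "<think>\n"
EMPTY_THINK = "<think>\n\n</think>\n\n"


def ensure_closed_think(prompt: str) -> str:
    """Return the prompt ending in the closed empty think block used in training."""
    if prompt.endswith(EMPTY_THINK):
        # Already in the trained format; leave untouched.
        return prompt
    if prompt.endswith(OPEN_THINK):
        # Template left the block open: swap it for the closed empty block.
        return prompt[: -len(OPEN_THINK)] + EMPTY_THINK
    # No think block at all: append the closed empty block.
    return prompt + EMPTY_THINK
```

Calling the helper twice gives the same result as calling it once, so it is safe to apply unconditionally before generation.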
🚀 Example PEFT / Transformers Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model, adapter = "Qwen/Qwen3.6-27B", "yuxinlu1/qwen3-6-27b-chinese-xianxia-lora-v2"

# The adapter repo ships the tokenizer and chat template, so load them from there.
tokenizer = AutoTokenizer.from_pretrained(adapter, trust_remote_code=True)
base = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base, adapter).eval()

messages = [{"role": "user", "content": "写一段玄幻修仙小说。"}]
# enable_thinking=False avoids the unclosed <think> block described above.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

output = model.generate(
    **inputs,
    max_new_tokens=640,
    temperature=0.8,
    top_p=0.9,
    repetition_penalty=1.15,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```
🚀 Example llama.cpp Usage
Pair this GGUF LoRA with a compatible Qwen3.6-27B GGUF base. Unsloth's Qwen3.6-27B-GGUF (Q4_K_M or Q8_0) is recommended.
```bash
llama-cli -m qwen3.6-27b-base.Q4_K_M.gguf \
  --lora qwen3-6-27b-chinese-xianxia-lora-v2-f16.gguf \
  --chat-template-file chat_template.jinja \
  -p "写一段玄幻修仙小说。" \
  -n 640 --temp 0.8 --top-p 0.9 --repeat-penalty 1.15
```
🧪 Example Prompts
Prompts can be short:
```markdown
写一段玄幻修仙小说。

讲一个修仙者的故事。

深山一处石洞里,洞壁上的灵光忽明忽暗,映着满地碎石。

小说:少年盘膝而坐,体内那缕灵气又凝实了几分。

把下面这段改成玄幻修仙小说正文:他打开电脑准备加班,忽然觉得胸口一热,等回过神来,发现自己躺在一间陌生的土屋里。
```
Suggested decoding range (tuned for this adapter):
| Setting | Range |
|---|---|
| temperature | 0.7 – 0.85 |
| top_p | 0.85 – 0.92 |
| repetition_penalty | 1.1 – 1.18 |
| max_new_tokens | 400 – 700 |
repetition_penalty matters here: too low (≈1.05) lets long generations fall into verbatim loops; too high (≈1.2+) can push Chinese character names toward homophone drift. ~1.15 is a good default. The adapter is strongest in shorter bursts — very long single generations (>~700 tokens) may drift; generate scene-by-scene for best results.
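The suggested ranges can be turned into a quick sanity check before a generation run. The sketch below simply encodes the table above; `check_decoding` is a hypothetical helper, and a warning only means a value falls outside the suggested range, not that it is wrong for every use.

```python
# Suggested ranges copied from the decoding table: (low, high), inclusive.
RECOMMENDED = {
    "temperature": (0.7, 0.85),
    "top_p": (0.85, 0.92),
    "repetition_penalty": (1.1, 1.18),
    "max_new_tokens": (400, 700),
}


def check_decoding(settings: dict) -> list[str]:
    """Return one warning per setting that falls outside the suggested range."""
    warnings = []
    for name, (low, high) in RECOMMENDED.items():
        if name not in settings:
            continue  # unspecified settings are left to the runtime default
        value = settings[name]
        if not low <= value <= high:
            warnings.append(
                f"{name}={value} is outside the suggested range {low}-{high}"
            )
    return warnings


if __name__ == "__main__":
    # The values used in the PEFT example pass without warnings.
    print(check_decoding({"temperature": 0.8, "top_p": 0.9,
                          "repetition_penalty": 1.15, "max_new_tokens": 640}))
```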
🧪 Internal Evaluation Snapshot
A small internal evaluation was run across four prompt tiers (original-text continuation, neutral scene description, character-grounded task, and minimal generic instruction) to verify style transfer.
Qualitative observations:
- Original-text continuation: natural continuation in the trained cultivation voice; realm/technique/artifact vocabulary and the xianxia register carry through.
- Neutral scene description: cultivation atmosphere and plot emerge from neutral prompts without explicit xianxia cues in the input.
- Character-grounded task: characters behave consistently with cultivation-fiction conventions (sect etiquette, closed-door breakthrough, clan politics).
- Minimal generic instruction (e.g. "write a piece of xianxia fiction"): the model produces cultivation prose directly without first explaining what xianxia is, which suggests the style was internalized rather than merely instruction-followed.
These observations come from a small local development eval and should be treated as an internal signal, not a public benchmark.
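A check in the same spirit can be reproduced locally with a crude heuristic: does the output start as Chinese prose, or as an English reasoning preamble? The sketch below is not the evaluation that was actually run. The function names, the 200-character window, and the 0.6 threshold are all assumptions, and `generate_fn` stands in for whatever inference call is used.

```python
import re
from typing import Callable

# CJK Unified Ideographs block; punctuation is deliberately not counted.
CJK = re.compile(r"[\u4e00-\u9fff]")


def cjk_ratio(text: str) -> float:
    """Share of non-whitespace characters that are CJK ideographs."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(1 for c in chars if CJK.match(c)) / len(chars)


def looks_like_direct_prose(text: str, min_ratio: float = 0.6) -> bool:
    """Heuristic: the opening of the output is mostly Chinese, with no preamble."""
    head = text.lstrip()[:200]
    if head.lower().startswith("here's a thinking process"):
        return False
    return cjk_ratio(head) >= min_ratio


def run_tiers(
    prompts_by_tier: dict[str, list[str]],
    generate_fn: Callable[[str], str],
) -> dict[str, list[bool]]:
    """Apply the heuristic to every prompt in every tier."""
    return {
        tier: [looks_like_direct_prose(generate_fn(p)) for p in prompts]
        for tier, prompts in prompts_by_tier.items()
    }
```

A heuristic like this only catches the gross failure mode (analysis or English output instead of prose); judging register and voice still needs a human reader.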
🧪 Intended Use
Intended for:
- local Chinese xianxia / cultivation drafting
- rewriting modern-toned paragraphs into cultivation-register prose
- testing third-person cultivation narration
- sect / clan / breakthrough fiction in a human-in-the-loop workflow
- offline and privacy-respecting novel drafting
- integration with a long-form novel pipeline
Not intended for: author impersonation, defamation, harassment, factual claims, high-stakes advice, spam, or deception.
⚠️ Limitations
- It does not guarantee full-novel plot coherence by itself. Character continuity, foreshadowing, and timeline logic should be handled by an external writing pipeline or by the author.
- Long single generations may degrade. Past roughly 600–700 tokens, very long outputs can drift, loop, or destabilize character names. Generate scene-by-scene and keep repetition_penalty near 1.15.
- Character-name stability depends on decoding settings; aggressive repetition penalties can turn a name into a homophone variant.
- Optimized for cultivation-world settings; outputs for modern, urban, or non-cultivation scenes may revert toward base-model tone.
- It may produce shorter-than-expected outputs if the prompt is very minimal or decoding settings are conservative.
- If enable_thinking=False is not set, outputs will begin with an English thinking-process preamble. This is the single most common usage error.
- Output quality depends on the base model, quantization, sampling settings, prompt design, and context quality.
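One way to work within the long-generation limitation is to draft one scene per call and regenerate any draft that falls into a verbatim loop. The following is a minimal sketch under assumptions: `generate_fn` is a placeholder for your own inference call, and the 12-character window and retry count are arbitrary starting points rather than tuned values.

```python
from typing import Callable


def has_verbatim_loop(text: str, window: int = 12) -> bool:
    """True if any `window`-character span occurs more than once in the text."""
    seen = set()
    for i in range(len(text) - window + 1):
        chunk = text[i : i + window]
        if chunk in seen:
            return True
        seen.add(chunk)
    return False


def draft_scenes(
    beats: list[str],
    generate_fn: Callable[[str], str],
    max_retries: int = 2,
) -> list[str]:
    """Generate one short draft per beat, retrying drafts that loop."""
    scenes = []
    for beat in beats:
        draft = generate_fn(beat)
        for _ in range(max_retries):
            if not has_verbatim_loop(draft):
                break
            draft = generate_fn(beat)  # sampling is stochastic, so retry
        scenes.append(draft)
    return scenes
```

If a draft still loops after the retries it is kept as-is, so a caller that needs a hard guarantee should re-check the returned scenes. Name drift toward homophones is not detected here; that would need a list of expected character names to compare against.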
🛡️ Safety and Legal Notes
- This repository contains adapter weights and supporting tokenizer/template files, not base model weights.
- No copyrighted novels, private manuscripts, or proprietary datasets are distributed in this repository.
- This LoRA is not designed to imitate any specific living author; the goal is to capture a broader Chinese xianxia / cultivation prose tradition.
- Outputs are machine-generated fiction; do not use this model for harassment, defamation, fraud, or deceptive impersonation.
- Users are responsible for the base model license, adapter license, and applicable law.
📜 License
LoRA adapter: MIT. Base model: governed by its own license.
Model provider: yuxinlu1
Model tree: base Qwen/Qwen3.6-27B, adapter this model
Modalities: input Video, Text, Image; output Text