hotdogs

qwen3.6-27b-mythos5k-lora

README

License: apache-2.0

Training Details

| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen3.6-27B |
| Quantization | 4-bit NF4 |
| Precision | BF16 |
| LoRA Rank (r) | 8 |
| LoRA Alpha | 16 |
| LoRA Dropout | 0.05 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Optimizer | paged_adamw_8bit |
| Learning Rate | 2e-4 (cosine schedule) |
| Batch Size | 1 (grad_accum=2 → effective 2) |
| Max Length | 1024 |
| Epochs | 3 |
| Training Steps | 7,500 |
| Training Time | 13h 01m |
| Final Loss | 0.040 |
| Final Accuracy | 98.5% |
| Hardware | RTX 4090 (via vast.ai) |
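
The numbers in the table can be cross-checked with a little arithmetic. The sketch below assumes that "Training Steps" counts optimizer steps on a single GPU and that the adapter uses the standard LoRA scaling of alpha / r; neither is stated explicitly in this card.

```python
# Values copied from the Training Details table.
lora_rank = 8
lora_alpha = 16
per_device_batch = 1
grad_accum = 2
epochs = 3
training_steps = 7_500

# Standard LoRA scales the low-rank update by alpha / r (assumed here).
lora_scale = lora_alpha / lora_rank

# One optimizer step consumes batch_size * grad_accum examples.
effective_batch = per_device_batch * grad_accum

# Total examples processed, and the dataset size that implies
# (assumes one GPU and no dropped or packed examples).
examples_seen = training_steps * effective_batch
implied_dataset_size = examples_seen // epochs

print(f"LoRA scale:           {lora_scale}")
print(f"Effective batch size: {effective_batch}")
print(f"Examples seen:        {examples_seen}")
print(f"Implied dataset size: {implied_dataset_size}")
```

Under those assumptions the implied dataset size is consistent with the "5k" in the dataset name.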

Dataset

Source: WithinUsAI/cluade_mythos_preview_5k_v2

Each example uses the messages format with system, user, and assistant roles. Examples are rendered into training text with tokenizer.apply_chat_template().
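
To illustrate what rendering a messages list into text means, here is a minimal sketch. `render_chatml` is a hypothetical helper, not part of any library, and it assumes ChatML-style `<|im_start|>` / `<|im_end|>` markers as used by earlier Qwen chat templates. The real template ships with the tokenizer and may differ, so use `tokenizer.apply_chat_template()` for actual training or inference. The assistant reply below is a made-up placeholder.

```python
def render_chatml(messages, add_generation_prompt=False):
    """Hypothetical helper: ChatML-style rendering, for illustration only."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so a model would continue from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a master storyteller of myths and legends."},
    {"role": "user", "content": "Tell me about the creation of the world in a forgotten pantheon."},
    {"role": "assistant", "content": "Before the first dawn, there was only the quiet sea."},
]

# Training text: all three turns, each closed with an end marker.
text = render_chatml(messages)
print(text)

# Inference prompt: system and user turns, plus an open assistant turn.
prompt = render_chatml(messages[:2], add_generation_prompt=True)
```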

GGUF (Weight-Diff)

A pre-converted LoRA GGUF is available for quick merging with llama.cpp:

bash

# Merge with base model (CPU merge, requires ~503GB RAM for F16)
llama-export-lora --no-mmap \
  --model Qwen3.6-27B-Q8_0.gguf \
  --lora-scaled GGUF/qwen36-mythos5k-lora.gguf:1.0 \
  --output mythos5k-f16.gguf

# Quantize to smaller format
llama-quantize mythos5k-f16.gguf mythos5k-q4_k_m.gguf Q4_K_M
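
For planning disk space around the merge and quantize steps, weight storage can be roughly estimated from the parameter count and the bits per weight of each format. This is a sketch under stated assumptions: the parameter count is taken from the "27B" in the model name, and the Q4_K_M figure is an approximate average. It counts weights only and ignores merge working memory, so peak RAM during the merge can be considerably higher.

```python
# Assumed from the "27B" in the model name; the exact count may differ.
PARAMS = 27e9

BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 8.5,     # 32 int8 weights plus one 16-bit scale per block
    "Q4_K_M": 4.85,  # approximate average; varies by architecture
}


def weight_gb(params, bits_per_weight):
    """Return approximate weight storage in gigabytes (10^9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


sizes = {name: weight_gb(PARAMS, bpw) for name, bpw in BITS_PER_WEIGHT.items()}
for name, gb in sizes.items():
    print(f"{name}: ~{gb:.0f} GB")
```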

Inference with Transformers

python

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then attach the LoRA adapter on top of it.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.6-27B",
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(
    base,
    "hotdogs/qwen3.6-27b-mythos5k-lora",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.6-27B", trust_remote_code=True)

messages = [
    {"role": "system", "content": "You are a master storyteller of myths and legends."},
    {"role": "user", "content": "Tell me about the creation of the world in a forgotten pantheon."},
]

# add_generation_prompt=True opens the assistant turn so the model writes a reply.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Send inputs to the device the model was placed on by device_map="auto".
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0]))

License

Apache 2.0


Created by UKA — 18yo coder & cybersecurity expert. June 18, 2026.


💖 Support

If you find this model useful, please consider supporting my work! 🙏

₿ Bitcoin (BTC):

bc1qf27cyk3vmugcdyv9xdtuv5jwz37863crpj5c9v

Thank you very much for your support! 🙏✨

Modalities

Input: Video, Text, Image
Output: Text