groxaxo

Code-Writer-V2-Obliterated-BF16

License: apache-2.0

The pitch, in one breath

A vision-capable, long-context (up to 200,000 tokens), free writer-and-coder in its purest, full-precision form. It writes prose that breathes and code that compiles — and here it does both with every bit intact.

That is the whole idea. Everything below is just how we kept the promise.

What it is

Code Writer V2 — Obliterated (BF16) is the merged, full-precision result of Qwen3.5-27B-Writer-V2-uncensored-heretic joined with a purpose-trained coding LoRA (coding_mix_8k, checkpoint-25, rank-16 / alpha-32) and saved in BF16 — no quantization, no compromise.
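
For readers new to LoRA merging, the step described above amounts to folding a low-rank update into each adapted weight matrix. The sketch below uses the standard LoRA formulation, W' = W + (alpha / r) * (B @ A), with the rank and alpha stated above, which gives a scale of 32 / 16 = 2.0. The toy matrices and the helper names `matmul` and `merge_lora` are illustrative assumptions, not the pipeline that produced this checkpoint.

```python
# Illustrative only: a toy LoRA merge in pure Python.
RANK = 16             # LoRA rank, from the model card
ALPHA = 32            # LoRA alpha, from the model card
SCALE = ALPHA / RANK  # standard LoRA scaling factor (2.0 here)


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_b, lora_a, scale=SCALE):
    """Return weight + scale * (lora_b @ lora_a) without mutating the inputs."""
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy shapes: a 2x2 weight with a rank-1 update (the real adapter uses rank 16).
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [0.5]]   # shape (out_features, r)
A = [[0.5, 0.25]]    # shape (r, in_features)
merged = merge_lora(W, B, A)
```

Once merged, the adapter no longer exists as a separate artifact: the result is a single set of BF16 weights, which is why this repository can serve as a base for further quantization.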

Architecture: Qwen3.5 (qwen3_5), a hybrid mind. Of its 64 decoder layers, only 16 carry full attention; the other 48 run Gated DeltaNet (GDN) linear attention. This is the secret of its long memory: only the full-attention layers keep a key-value cache that grows with context.
Modalities: a full vision tower rides along (served text-only by default; vision is wired but untested — light the candle at your own pleasure).
Character: heretic by lineage and free by intent — it does not flinch, and it does not lecture. It simply does the work.
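
To make the long-context point concrete, here is a rough sizing sketch for the key-value cache at 200,000 tokens. The layer counts and `num_key_value_heads` come from this card; `HEAD_DIM = 128` is an assumed placeholder, so check the model's `config.json` before relying on the numbers. The fixed-size state of the linear-attention layers is ignored here, and `kv_cache_bytes` is a hypothetical helper.

```python
# Rough KV-cache sizing for a hybrid stack. HEAD_DIM is an assumed placeholder.
FULL_ATTENTION_LAYERS = 16  # from the model card
TOTAL_LAYERS = 64           # from the model card
NUM_KV_HEADS = 4            # from the model card
HEAD_DIM = 128              # ASSUMPTION: read the real value from config.json
BYTES_PER_VALUE = 2         # BF16 stores each value in 2 bytes


def kv_cache_bytes(tokens, layers, kv_heads=NUM_KV_HEADS,
                   head_dim=HEAD_DIM, bytes_per_value=BYTES_PER_VALUE):
    """Bytes needed to cache keys and values for `tokens` positions."""
    # The leading 2 accounts for storing both a key and a value per position.
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens


hybrid = kv_cache_bytes(200_000, FULL_ATTENTION_LAYERS)
dense = kv_cache_bytes(200_000, TOTAL_LAYERS)
print(f"hybrid: {hybrid / 2**30:.1f} GiB, all full attention: {dense / 2**30:.1f} GiB")
```

Whatever the true head dimension turns out to be, the ratio holds: caching 16 layers instead of 64 cuts the growing part of the cache to a quarter.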

Which one do I want?

	This — BF16	FP8
Fidelity	Reference master, full precision	Faithful, ~half the footprint
Footprint	~12 shards, BF16	FP8 weights, fits 2 consumer GPUs
Use it for	golden reference, further quantization, max quality	day-to-day serving on vLLM

If you plan to serve it now, take the FP8. If you want the untouched source of truth — or a base for your own quants — you're in the right place.
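
The "~half the footprint" figure follows from bytes per parameter. As a back-of-the-envelope sketch, assuming a nominal 27 billion parameters (taken from the "27B" in the base model's name; the true count, vision tower included, will differ somewhat):

```python
# Back-of-the-envelope weight sizes. PARAMS is a nominal, assumed figure.
PARAMS = 27e9  # ASSUMPTION: from the "27B" in the model name


def weight_gb(params, bytes_per_param):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


bf16_gb = weight_gb(PARAMS, 2)  # BF16: 2 bytes per parameter
fp8_gb = weight_gb(PARAMS, 1)   # FP8: 1 byte per parameter
print(f"BF16 ~{bf16_gb:.0f} GB, FP8 ~{fp8_gb:.0f} GB")
```

These numbers cover weights only; activations and the KV cache need additional memory at serving time.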

Sampling (official Qwen3.5-27B recommendations)

Mode	temp	top_p	notes
instruct	1.0	0.95	top_k 20, min_p 0
general	0.7	0.80	top_k 20, min_p 0
coding	0.6	0.95	thinking on
thinking	1.0	0.95	thinking on
roleplay
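
One way to carry these settings into a client is a small preset table. The sketch below copies only the values listed above; the roleplay row has no values in this card, so it is left out rather than guessed. The key names (`temperature`, `top_p`, `top_k`, `min_p`) follow a common convention, while the `thinking` flag and the `preset_for` helper are hypothetical names that you would map onto your own serving stack.

```python
# Sampling presets transcribed from the table above (roleplay omitted: no values given).
SAMPLING_PRESETS = {
    "instruct": {"temperature": 1.0, "top_p": 0.95, "top_k": 20, "min_p": 0.0},
    "general": {"temperature": 0.7, "top_p": 0.80, "top_k": 20, "min_p": 0.0},
    "coding": {"temperature": 0.6, "top_p": 0.95, "thinking": True},
    "thinking": {"temperature": 1.0, "top_p": 0.95, "thinking": True},
}


def preset_for(mode):
    """Return a copy of the preset for `mode`, or raise ValueError if unknown."""
    try:
        return dict(SAMPLING_PRESETS[mode])
    except KeyError:
        known = ", ".join(sorted(SAMPLING_PRESETS))
        raise ValueError(f"unknown mode {mode!r}; expected one of: {known}") from None


coding = preset_for("coding")
```

Returning a copy keeps callers from accidentally mutating the shared table when they tweak a single request.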

Note: this is a pure decoder (layers 0–63) with no MTP head and no native tool-calling. num_key_value_heads = 4, so the tensor-parallel size must divide it evenly: use 2 or 4 across multiple GPUs (never 3).
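
The tensor-parallel rule in the note can be checked before launching a server. This is a minimal sketch of the even-division rule only; `valid_tensor_parallel_sizes` and `check_tensor_parallel` are hypothetical helpers, and some serving stacks may relax the rule by replicating KV heads, which is not modelled here.

```python
# Pre-flight check for the tensor-parallel rule stated in the note above.
NUM_KEY_VALUE_HEADS = 4  # from the model card


def valid_tensor_parallel_sizes(num_kv_heads=NUM_KEY_VALUE_HEADS):
    """List every size that splits the KV heads evenly across GPUs."""
    return [n for n in range(1, num_kv_heads + 1) if num_kv_heads % n == 0]


def check_tensor_parallel(size, num_kv_heads=NUM_KEY_VALUE_HEADS):
    """Return `size` if it divides the KV heads evenly, else raise ValueError."""
    if size < 1 or num_kv_heads % size != 0:
        allowed = valid_tensor_parallel_sizes(num_kv_heads)
        raise ValueError(
            f"tensor-parallel size {size} does not divide {num_kv_heads} KV heads; "
            f"use one of {allowed}"
        )
    return size


sizes = valid_tensor_parallel_sizes()
```

A size of 1 passes the check as well, since it simply means no tensor parallelism.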

What it's for

Writing — fiction, screenplay, copy, the long dark prose of the soul.
Code — the LoRA was trained for it; the temperament was kept for it.
Long work — 200k tokens means whole codebases, whole manuscripts, whole conversations held in a single thought.

What to know before you sail

It is free. Freedom is a tool; you are the hand that holds it. You own what you make with it.
Vision is present but unproven here — validate an image path before you trust it in production.

Provenance

Base: llmfan46/Qwen3.5-27B-Writer-V2-uncensored-heretic (BF16)
LoRA: coding_mix_8k checkpoint-25 (r16, α32), merged to BF16
Precision: BF16, unquantized
Built: 2026-06-22

Real artists ship. So we shipped a poet that codes.

Now go make something.

Model Details

Model Provider

groxaxo

Model Tree

Base

llmfan46/Qwen3.5-27B-Writer-V2-uncensored-heretic

Fine-tuned

this model

Input Modalities

Text

Image

Video

Output Modalities