Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

README

License: apache-2.0

Which artifact do you want?

you wantuse
Just run the modelgr33r/ux-writing-1 (merged, vanilla transformers)
Run it on a laptopgr33r/ux-writing-1-GGUF (Q4_K_M 16.6 GB)
This repo: attach to base / continue trainingthe 159 MB LoRA (r=16 α=32, LM projections)

Usage (PEFT)

python

import torch
from peft import PeftModel
from transformers import AutoModelForImageTextToText, AutoTokenizer
base = "Qwen/Qwen3.6-27B"
tok = AutoTokenizer.from_pretrained(base)
model = AutoModelForImageTextToText.from_pretrained(base, dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, "gr33r/ux-writing-1-lora")
# Prompt contract + enable_thinking=False guidance: see the merged model card.

Training: QLoRA (4-bit NF4, double-quant, bf16 compute), LoRA on q,k,v,o,gate,up,down projections, 2 epochs on ≈1,400 owner-authored/derived rewrite pairs, one A100-80GB. To fine-tune further on your style guide (≈$2–6 on HF Jobs), see FINETUNE_GUIDE.md.

License: Apache-2.0. Attribution appreciated: ux-writing-1 by gr33r.

Model provider

build-small-hackathon

Model tree

Base

Qwen/Qwen3.6-27B

Adapter

this model

Modalities

Input

Video, Text, Image

Output

Text

Pricing

Dedicated Endpoints

View details

Supported Functionality

Model APIs

Dedicated Endpoints

Container

More information

Explore FriendliAI today