eemin

Carnice-V2-27b

License: apache-2.0

BF16 Transformers Loading Fix

The BF16 safetensors were republished with corrected Qwen3_5ForConditionalGeneration tensor prefixes. The original merge artifact accidentally serialized an extra Unsloth wrapper prefix, which caused direct HF Transformers loads to report the real weights as unexpected keys and initialize expected layers randomly. GGUF files were not affected because the GGUF conversion path normalized those prefixes.
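The failure mode described above can be illustrated with a small key-comparison sketch. It is not the repair script used for this model: the prefix `wrapper.` and the tensor names are hypothetical placeholders, since the actual Unsloth prefix is not stated here. It only shows why an extra prefix makes real weights appear as unexpected keys while the expected layers appear as missing.

```python
from typing import Dict, Iterable, List


def diagnose_keys(checkpoint_keys: Iterable[str], expected_keys: Iterable[str]) -> Dict[str, List[str]]:
    """Report keys present only in the checkpoint and keys the model expects but cannot find."""
    ckpt, expected = set(checkpoint_keys), set(expected_keys)
    return {
        "unexpected": sorted(ckpt - expected),
        "missing": sorted(expected - ckpt),
    }


def strip_prefix(keys: Iterable[str], prefix: str) -> List[str]:
    """Remove a leading wrapper prefix from every key that carries it."""
    return [k[len(prefix):] if k.startswith(prefix) else k for k in keys]


# Hypothetical names, for illustration only.
expected = ["model.layers.0.weight", "model.layers.1.weight"]
serialized = ["wrapper.model.layers.0.weight", "wrapper.model.layers.1.weight"]

before = diagnose_keys(serialized, expected)
after = diagnose_keys(strip_prefix(serialized, "wrapper."), expected)
print(before)
print(after)
```

With the extra prefix in place, every tensor is both unexpected and missing; after normalizing the prefix, both lists are empty.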

Benchmarks

[Figure: Carnice V2 benchmark card]

| Metric | Qwen3.6-27B base | Carnice SFT |
| --- | --- | --- |
| IFEval prompt strict, limit 20 | 85.0% | 90.0% |
| IFEval prompt loose, limit 20 | 85.0% | 90.0% |
| IFEval instruction strict, limit 20 | 90.0% | 93.3% |
| IFEval instruction loose, limit 20 | 90.0% | 93.3% |
| Held-out assistant-token eval loss | 0.607 | 0.414 |
| Held-out assistant-token eval perplexity | 1.835 | 1.513 |
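The loss and perplexity rows are two views of the same measurement. Assuming the reported loss is a mean per-token cross-entropy in nats (the usual convention, though not stated explicitly here), perplexity is its exponential. A minimal check against the reported values:

```python
import math


def perplexity(mean_loss: float) -> float:
    """Perplexity for a mean per-token cross-entropy measured in nats."""
    return math.exp(mean_loss)


# (loss, perplexity) pairs as reported in the benchmark table.
reported = {
    "Qwen3.6-27B base": (0.607, 1.835),
    "Carnice SFT": (0.414, 1.513),
}

for name, (loss, ppl) in reported.items():
    print(f"{name}: exp({loss}) = {perplexity(loss):.3f} (reported {ppl})")
```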

The benchmark artifact bundle is included under benchmarks/. It contains the rendered graph, extracted metrics.json, benchmark scripts, and raw result files used to make the chart.

Scope note: the IFEval run is a short limit=20 A/B smoke benchmark, not an official full leaderboard score. Held-out loss/perplexity is the exact assistant-only training-format validation metric from the SFT script. The raw BFCL two-case smoke files are included for auditability, but they are too small to use as a model-quality claim.
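To make the small-sample caveat concrete: with limit 20, the prompt-level scores of 85.0% and 90.0% correspond to 17 and 18 prompts out of 20, a difference of one prompt. The sketch below computes Wilson score intervals for those counts. The choice of the Wilson interval and of z = 1.96 is an illustrative assumption, not part of the benchmark scripts.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


base_interval = wilson_interval(17, 20)  # 85.0% prompt-level
sft_interval = wilson_interval(18, 20)   # 90.0% prompt-level
print("base:", base_interval)
print("sft: ", sft_interval)
```

The two intervals overlap heavily, which is why the IFEval numbers are best read as a smoke test rather than as evidence of a measured improvement.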

Training

This checkpoint was produced from the recovered 8K split-window Carnice run:

| Item | Value |
| --- | --- |
| Base model | Qwen/Qwen3.6-27B |
| SFT framework | Unsloth/PEFT LoRA, then merged to BF16 safetensors |
| Loss mask | Assistant-token-only |
| Context/windowing | 8,192 token windows with 1,024 token overlap |
| Train rows before windowing | 3,473 |
| Train windows | 6,554 |
| Eval examples | 110 |
| Source mix | 1,508 Carnice rows, 1,015 DJLougen Hermes rows, 950 Lambda GLM-5.1 Hermes rows |
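The windowing and loss-mask settings can be sketched as follows. This is a minimal illustration, not the actual SFT script: `split_windows` and `mask_non_assistant` are hypothetical helpers, and the handling of the final short window is an assumption. The value -100 is the default `ignore_index` of PyTorch's cross-entropy loss, which is the common way to exclude tokens from the loss.

```python
from typing import List, Sequence

IGNORE_INDEX = -100  # default ignore_index of PyTorch cross-entropy


def split_windows(token_ids: Sequence[int], window: int = 8192, overlap: int = 1024) -> List[List[int]]:
    """Split a token sequence into windows where consecutive windows share `overlap` tokens."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    if not token_ids:
        return []
    stride = window - overlap
    windows = []
    start = 0
    while True:
        windows.append(list(token_ids[start:start + window]))
        if start + window >= len(token_ids):
            break  # this window already reaches the end of the sequence
        start += stride
    return windows


def mask_non_assistant(token_ids: Sequence[int], is_assistant: Sequence[bool]) -> List[int]:
    """Build labels that keep assistant tokens and ignore every other position."""
    return [tok if keep else IGNORE_INDEX for tok, keep in zip(token_ids, is_assistant)]


windows = split_windows(list(range(20000)))
labels = mask_non_assistant([5, 6, 7], [False, True, True])
print(len(windows), [len(w) for w in windows])
print(labels)
```

With these defaults the stride is 7,168 tokens, so the last 1,024 tokens of one window reappear as the first 1,024 tokens of the next.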

Usage

python

import torch
from transformers import AutoModelForImageTextToText, AutoTokenizer

model_id = "kai-os/carnice-v2-27b"

# Load the tokenizer and the BF16 weights.
# device_map="auto" places layers across the available devices.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForImageTextToText.from_pretrained(
    model_id,
    dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

This model is intended for agentic, Hermes-style use. Validate it with your own agent harness before relying on it in production.
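One check an agent harness might include is whether generated text contains well-formed tool calls. The sketch below assumes the Hermes-style convention of a JSON object with a `name` field wrapped in `<tool_call>` tags. That format is an assumption: the markup this model actually emits depends on its chat template and should be confirmed before reuse. `extract_tool_calls` is a hypothetical helper, not part of any library.

```python
import json
import re
from typing import Any, Dict, List

# Assumed Hermes-style markup: <tool_call>{...json...}</tool_call>
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text: str) -> List[Dict[str, Any]]:
    """Return every tool call in `text` that parses as a JSON object with a string name."""
    calls = []
    for raw in TOOL_CALL_RE.findall(text):
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed JSON counts as a failed call
        if isinstance(payload, dict) and isinstance(payload.get("name"), str):
            calls.append(payload)
    return calls


sample = (
    'Checking the weather.\n'
    '<tool_call>\n{"name": "get_weather", "arguments": {"city": "Seoul"}}\n</tool_call>\n'
    '<tool_call>{not valid json}</tool_call>'
)
calls = extract_tool_calls(sample)
print(calls)
```

Counting how many sampled responses yield at least one valid call, and comparing the parsed names against the tools the harness actually offered, gives a simple pass rate to track across prompts.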

Modalities

Input: Video, Text, Image

Output: Text
