eemin
Carnice-V2-27b
License: apache-2.0
BF16 Transformers Loading Fix
The BF16 safetensors were republished with corrected Qwen3_5ForConditionalGeneration tensor prefixes. The original merge artifact accidentally serialized an extra Unsloth wrapper prefix, which caused direct HF Transformers loads to report the real weights as unexpected keys and initialize expected layers randomly. GGUF files were not affected because the GGUF conversion path normalized those prefixes.
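The failure mode described above can be illustrated with a minimal sketch. The prefix `wrapper.` and the tensor names below are hypothetical placeholders: this card does not state the actual prefix that was serialized, and this is not the script used to republish the weights.

```python
def strip_wrapper_prefix(keys, prefix):
    """Return keys with a leading wrapper prefix removed; other keys are unchanged."""
    return [k[len(prefix):] if k.startswith(prefix) else k for k in keys]


def diff_keys(expected, found):
    """Mimic a loader report: (missing expected keys, unexpected found keys)."""
    expected, found = set(expected), set(found)
    return sorted(expected - found), sorted(found - expected)


# Hypothetical tensor names, for illustration only.
expected = ["model.layers.0.mlp.up_proj.weight", "lm_head.weight"]
serialized = ["wrapper." + k for k in expected]

# With the extra prefix, every real weight is "unexpected" and every
# expected layer is "missing", so it would be randomly initialized.
missing, unexpected = diff_keys(expected, serialized)

# After normalizing the prefix, both lists are empty.
fixed = strip_wrapper_prefix(serialized, "wrapper.")
missing_after, unexpected_after = diff_keys(expected, fixed)
```

If a load of this checkpoint still reports unexpected or missing keys, comparing the key names in this way is a quick check for a stale download of the original artifact.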
Benchmarks

| Metric | Qwen3.6-27B base | Carnice SFT |
|---|---|---|
| IFEval prompt strict, limit 20 | 85.0% | 90.0% |
| IFEval prompt loose, limit 20 | 85.0% | 90.0% |
| IFEval instruction strict, limit 20 | 90.0% | 93.3% |
| IFEval instruction loose, limit 20 | 90.0% | 93.3% |
| Held-out assistant-token eval loss | 0.607 | 0.414 |
| Held-out assistant-token eval perplexity | 1.835 | 1.513 |
The benchmark artifact bundle is included under benchmarks/. It contains the rendered graph, extracted metrics.json, benchmark scripts, and raw result files used to make the chart.
Scope note: the IFEval run is a short limit=20 A/B smoke benchmark, not an official full leaderboard score. Held-out loss/perplexity is the exact assistant-only training-format validation metric from the SFT script. The raw BFCL two-case smoke files are included for auditability, but they are too small to use as a model-quality claim.
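As a consistency check on the table, the reported perplexities match the exponential of the reported losses. This sketch assumes the loss is the mean per-token cross-entropy in nats, which is the usual convention but is not stated explicitly in this card.

```python
import math


def perplexity(mean_nll):
    """Perplexity for a mean per-token negative log-likelihood in nats."""
    return math.exp(mean_nll)


base_ppl = perplexity(0.607)  # Qwen3.6-27B base, loss from the table
sft_ppl = perplexity(0.414)   # Carnice SFT, loss from the table

print(f"{base_ppl:.3f} {sft_ppl:.3f}")  # 1.835 1.513
```

Because the losses in the table are rounded to three decimals, agreement is only expected to within rounding.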
Training
This checkpoint was produced from the recovered 8K split-window Carnice run:
| Item | Value |
|---|---|
| Base model | Qwen/Qwen3.6-27B |
| SFT framework | Unsloth/PEFT LoRA, then merged to BF16 safetensors |
| Loss mask | Assistant-token-only |
| Context/windowing | 8,192 token windows with 1,024 token overlap |
| Train rows before windowing | 3,473 |
| Train windows | 6,554 |
| Eval examples | 110 |
| Source mix | 1,508 Carnice rows, 1,015 DJLougen Hermes rows, 950 Lambda GLM-5.1 Hermes rows |
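
The windowing row in the table can be sketched as a sliding window with a stride of 8,192 - 1,024 = 7,168 tokens. This is a minimal illustration, not the training script: the function name `split_windows` and the handling of the final short window are assumptions, and the real pipeline may treat tail windows or assistant masks differently.

```python
def split_windows(token_ids, window=8192, overlap=1024):
    """Split a token sequence into windows that overlap by `overlap` tokens."""
    if overlap >= window:
        raise ValueError("overlap must be smaller than window")
    stride = window - overlap
    windows = []
    start = 0
    while True:
        windows.append(token_ids[start:start + window])
        if start + window >= len(token_ids):
            break  # the last window reached the end of the row
        start += stride
    return windows


# A hypothetical 20,000-token row yields three windows.
row = list(range(20000))
windows = split_windows(row)
lengths = [len(w) for w in windows]  # [8192, 8192, 5664]

# The source mix from the table adds up to the pre-windowing row count.
source_rows = {"Carnice": 1508, "DJLougen Hermes": 1015, "Lambda GLM-5.1 Hermes": 950}
assert sum(source_rows.values()) == 3473
```

The table's counts (6,554 windows from 3,473 rows) work out to about 1.9 windows per row on average.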
Usage
```python
import torch
from transformers import AutoModelForImageTextToText, AutoTokenizer

model_id = "kai-os/carnice-v2-27b"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForImageTextToText.from_pretrained(
    model_id,
    dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
```
This is intended for agentic Hermes-style use. Validate with your own agent harness before relying on it for production behavior.
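One small piece of such a harness is checking that generated tool calls are well formed. The sketch below assumes the Hermes function-calling convention of JSON objects wrapped in `<tool_call>` tags; this card does not confirm the exact output format of this model, and `parse_tool_calls` and the sample text are hypothetical.

```python
import json
import re

# Assumed Hermes-style wrapper: <tool_call>{...json...}</tool_call>
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def parse_tool_calls(text):
    """Return the well-formed tool calls found in generated text."""
    calls = []
    for raw in TOOL_CALL_RE.findall(text):
        try:
            call = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed JSON instead of failing the whole check
        if isinstance(call, dict) and "name" in call:
            calls.append(call)
    return calls


# Hypothetical model output, for illustration only.
sample = (
    "Checking the forecast.\n"
    "<tool_call>\n"
    '{"name": "get_weather", "arguments": {"city": "Oslo"}}\n'
    "</tool_call>"
)
calls = parse_tool_calls(sample)
```

Counting how often real generations yield zero parsed calls when a call was expected gives a simple pass/fail signal before wiring the model into a full agent loop.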
Model provider: eemin
Model tree: Qwen/Qwen3.6-27B (base), this model (fine-tuned)
Modalities: Video, Text, Image (input); Text (output)