Yingyaeliae

Ministral-3-14B-Nymphaea-RP-heretic

README

License: apache-2.0

Abliteration parameters

| Parameter | Value |
| --- | --- |
| direction_index | 17.23 |
| attn.o_proj.max_weight | 1.38 |
| attn.o_proj.max_weight_position | 24.62 |
| attn.o_proj.min_weight | 0.05 |
| attn.o_proj.min_weight_distance | 11.04 |
| mlp.down_proj.max_weight | 1.49 |
| mlp.down_proj.max_weight_position | 27.32 |
| mlp.down_proj.min_weight | 0.92 |
| mlp.down_proj.min_weight_distance | 18.76 |
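To make the table easier to read, here is a minimal sketch of how these four per-component values could translate into a per-layer ablation weight. It assumes the linear-ramp kernel that Heretic's documentation describes (weight peaks at `max_weight_position` and falls linearly to `min_weight` at `min_weight_distance` layers away, with no ablation beyond that). The function name `ablation_weight` and the sampled layer indices are illustrative and do not come from this card.

```python
def ablation_weight(layer, max_weight, max_weight_position,
                    min_weight, min_weight_distance):
    """Ablation weight for one layer under an assumed linear-ramp kernel."""
    distance = abs(layer - max_weight_position)
    if distance > min_weight_distance:
        # Assumption: layers outside the kernel are left untouched.
        return 0.0
    # Linear interpolation from max_weight (distance 0) to min_weight (edge).
    return max_weight + (distance / min_weight_distance) * (min_weight - max_weight)


# Values copied from the table above.
ATTN_O_PROJ = dict(max_weight=1.38, max_weight_position=24.62,
                   min_weight=0.05, min_weight_distance=11.04)
MLP_DOWN_PROJ = dict(max_weight=1.49, max_weight_position=27.32,
                     min_weight=0.92, min_weight_distance=18.76)

if __name__ == "__main__":
    # Layer indices are arbitrary samples, not the model's real layer count.
    for layer in (10, 20, 25, 30, 35):
        attn = ablation_weight(layer, **ATTN_O_PROJ)
        mlp = ablation_weight(layer, **MLP_DOWN_PROJ)
        print(f"layer {layer:2d}  attn.o_proj={attn:.3f}  mlp.down_proj={mlp:.3f}")
```

Under this reading, `attn.o_proj` is ablated strongly only in a narrow band of late layers, while `mlp.down_proj` is ablated over a much wider range and never drops far below its peak.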

Performance

| Metric | This model | Original model (0xA50C1A1/Ministral-3-14B-Nymphaea-RP) |
| --- | --- | --- |
| KL divergence | 0.0158 | 0 (by definition) |
| Refusals | 3/100 | 13/100 |
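For readers unfamiliar with the two metrics, the sketch below shows the arithmetic behind them. The probability vectors are made-up toy values, and the direction of the divergence (original relative to modified) is an assumption; the card does not state how the evaluation was set up. It also illustrates why the original model scores exactly 0: a distribution has zero KL divergence from itself.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same tokens."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def refusal_rate(refusals, prompts):
    """Fraction of prompts that were refused."""
    return refusals / prompts


# Hypothetical next-token distributions over a three-token vocabulary.
original = [0.70, 0.20, 0.10]
modified = [0.65, 0.24, 0.11]

if __name__ == "__main__":
    print("KL(original || original):", kl_divergence(original, original))
    print("KL(original || modified):", kl_divergence(original, modified))
    # Refusal counts taken from the table above.
    print("refusal rate, this model:", refusal_rate(3, 100))
    print("refusal rate, original:  ", refusal_rate(13, 100))
```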

Ministral-3-14B-Nymphaea-RP

A fine-tune of Ministral 3 14B Instruct 2512 for roleplay and creative writing.

[!Tip] The SillyTavern preset is available here. For custom presets, please use the Mistral V7-Tekken instruct template.

Tested at Q6_K quantization with the Web Search extension (via SearXNG) in SillyTavern.

SillyTavern Screenshot

GGUF

Here is my custom mixed-quant GGUF, which I use regularly. It fits fine into 16GB VRAM with a 16K context window (using Q8 KV cache). If you need mmproj, it's available here.
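As a rough way to sanity-check the memory claim, the KV cache size can be estimated from the context length and the attention geometry. The sketch below is only an estimate: the layer count, KV head count, and head dimension are placeholders (read the real values from the model's config), and only the q8_0 storage cost (34 bytes per block of 32 values) is a known constant.

```python
# q8_0 stores each block of 32 values as 32 int8 entries plus one fp16 scale.
Q8_0_BYTES_PER_VALUE = 34 / 32


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_value):
    """Approximate KV cache size; the factor 2 covers the K and V tensors."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_value


if __name__ == "__main__":
    # Placeholder geometry, NOT taken from this card or the model config.
    size = kv_cache_bytes(n_layers=40, n_kv_heads=8, head_dim=128,
                          context_len=16 * 1024,
                          bytes_per_value=Q8_0_BYTES_PER_VALUE)
    print(f"estimated KV cache: {size / 2**30:.2f} GiB")
```

Adding this figure to the GGUF file size and a margin for compute buffers gives a first-order estimate of the VRAM needed.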

llama-quantize \
--imatrix imatrix.gguf \
--token-embedding-type q8_0 \
--output-tensor-type q8_0 \
--tensor-type ".*attn_q.weight=q8_0" \
--tensor-type ".*attn_k.weight=q8_0" \
--tensor-type ".*attn_output.weight=q5_k" \
--tensor-type ".*attn_v.weight=iq4_nl" \
--tensor-type ".*ffn_up.weight=iq4_nl" \
--tensor-type ".*ffn_gate.weight=iq4_nl" \
Ministral-3-14B-Nymphaea-RP.F16.gguf \
Ministral-3-14B-Nymphaea-RP.Q5_Mix.gguf \
q5_k
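The command above can be read as a set of per-tensor overrides on top of a `q5_k` default. The following sketch mimics that lookup in Python so the resulting plan is easy to inspect. It makes several assumptions: that patterns are applied as regular-expression searches against GGUF tensor names, that the first matching pattern wins, and that everything else falls back to a single default type (the real tool applies its own per-tensor mix for `q5_k`). The tensor names in the example are illustrative.

```python
import re

# Overrides copied from the --tensor-type flags above, in the same order.
OVERRIDES = [
    (r".*attn_q.weight", "q8_0"),
    (r".*attn_k.weight", "q8_0"),
    (r".*attn_output.weight", "q5_k"),
    (r".*attn_v.weight", "iq4_nl"),
    (r".*ffn_up.weight", "iq4_nl"),
    (r".*ffn_gate.weight", "iq4_nl"),
]
DEFAULT_TYPE = "q5_k"  # simplification of the tool's built-in q5_k mix


def planned_type(tensor_name):
    """Return the quantization type this sketch would pick for a tensor."""
    # --token-embedding-type and --output-tensor-type target these two tensors.
    if tensor_name in ("token_embd.weight", "output.weight"):
        return "q8_0"
    for pattern, qtype in OVERRIDES:
        if re.search(pattern, tensor_name):
            return qtype
    return DEFAULT_TYPE


if __name__ == "__main__":
    for name in ("token_embd.weight", "blk.0.attn_q.weight",
                 "blk.0.attn_v.weight", "blk.0.ffn_down.weight"):
        print(f"{name:24s} -> {planned_type(name)}")
```

The pattern that emerges is higher precision for the query, key, embedding, and output tensors, and lower precision for the value, up, and gate projections.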

Imatrix file for making your own quants is available here. I used this calibration dataset to create it, expanding it with RP and creative writing data (about 400k tokens).

Training Notes

Trained on the latest iteration of my Darkmere dataset. This version features expanded genre variety, built upon a mix of manually curated synthetics and human-written stories.

[!IMPORTANT] The base weights are abliterated via Heretic prior to fine-tuning, so this fine-tune is quite uncensored.

Method:

  • Training Method: DoRA (Weight-Decomposed LoRA)
  • Target Modules: all-linear
  • LoRA Rank: 64
  • LoRA Alpha: 64
  • LoRA Dropout: 0.05
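The method settings above can be written down as a plain configuration mapping. The key names below mirror the field names used by PEFT's `LoraConfig`; that the author used PEFT is an assumption, since the card does not name the training library. The helper shows the standard LoRA scaling factor (alpha divided by rank) that results from these values.

```python
# Key names follow PEFT's LoraConfig fields; the tooling choice is assumed.
adapter_config = {
    "use_dora": True,            # DoRA (Weight-Decomposed LoRA)
    "target_modules": "all-linear",
    "r": 64,                     # LoRA rank
    "lora_alpha": 64,
    "lora_dropout": 0.05,
}


def lora_scaling(r, lora_alpha):
    """Standard LoRA scaling applied to the low-rank update: alpha / rank."""
    return lora_alpha / r


if __name__ == "__main__":
    scale = lora_scaling(adapter_config["r"], adapter_config["lora_alpha"])
    print("LoRA scaling factor:", scale)
```

With rank and alpha both set to 64, the low-rank update is applied at unit scale.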

Hyperparameters:

  • Batch Size: 2 (Per-device)
  • Gradient Accumulation: 2
  • Epochs: 2
  • Learning Rate: 1e-4
  • Optimizer: adamw_torch_fused
  • LR Scheduler: cosine
  • Noise Level: neftune_noise_alpha=5
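Two quantities follow directly from these hyperparameters: the effective batch size per optimizer step and the shape of the learning-rate curve. The sketch below computes both. The device count and the absence of a warmup phase are assumptions, because the card states neither, and the total step count is a placeholder.

```python
import math


def effective_batch_size(per_device, grad_accum, n_devices=1):
    """Sequences contributing to each optimizer step."""
    return per_device * grad_accum * n_devices


def cosine_lr(step, total_steps, peak_lr=1e-4):
    """Cosine decay from peak_lr to zero, with no warmup (assumed)."""
    progress = step / total_steps
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Batch size 2 and gradient accumulation 2 are from the list above;
    # a single device is assumed.
    print("effective batch size:", effective_batch_size(2, 2))
    total = 1000  # placeholder step count
    for step in (0, 250, 500, 750, 1000):
        print(f"step {step:4d}  lr={cosine_lr(step, total):.2e}")
```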

[!Note] The vision encoder was frozen during training, so the model retains its native vision capabilities.

Special Thanks

This fine-tune wouldn't be possible without the incredible work of the community:

Model provider: Yingyaeliae

Model tree

  • Base: 0xA50C1A1/Ministral-3-14B-Instruct-2512-BF16-SOM-MPOA
  • Fine-tuned: this model

Modalities

  • Input: Text
  • Output: Text
