Yingyaeliae
Ministral-3-14B-Nymphaea-RP-heretic
Run this model inference on single tenant GPU with unmatched speed and reliability at scale.
Run this model inference with full control and performance in your environment.
Get help setting up a custom Dedicated Endpoints.
Talk with our engineer to get a quote for reserved GPU instances with discounts.
README
License: apache-2.0Abliteration parameters
| Parameter | Value |
|---|---|
| direction_index | 17.23 |
| attn.o_proj.max_weight | 1.38 |
| attn.o_proj.max_weight_position | 24.62 |
| attn.o_proj.min_weight | 0.05 |
| attn.o_proj.min_weight_distance | 11.04 |
| mlp.down_proj.max_weight | 1.49 |
| mlp.down_proj.max_weight_position | 27.32 |
| mlp.down_proj.min_weight | 0.92 |
| mlp.down_proj.min_weight_distance | 18.76 |
Performance
| Metric | This model | Original model (0xA50C1A1/Ministral-3-14B-Nymphaea-RP) |
|---|---|---|
| KL divergence | 0.0158 | 0 (by definition) |
| Refusals | 3/100 | 13/100 |
Ministral-3-14B-Nymphaea-RP
A fine-tune of Ministral 3 14B Instruct 2512 for roleplay and creative writing.
[!Tip] The SillyTavern preset is available here. For custom presets, please use the Mistral V7-Tekken instruct template.
Tested at Q6_K quantization with the Web Search extension (via SearXNG) in SillyTavern.

GGUF
Here is my custom mixed-quant GGUF, which I use regularly. It fits fine into 16GB VRAM with a 16K context window (using Q8 KV cache). If you need mmproj, it's available here.
markdown
llama-quantize \--imatrix imatrix.gguf \--token-embedding-type q8_0 \--output-tensor-type q8_0 \--tensor-type ".*attn_q.weight=q8_0" \--tensor-type ".*attn_k.weight=q8_0" \--tensor-type ".*attn_output.weight=q5_k" \--tensor-type ".*attn_v.weight=iq4_nl" \--tensor-type ".*ffn_up.weight=iq4_nl" \--tensor-type ".*ffn_gate.weight=iq4_nl" \Ministral-3-14B-Nymphaea-RP.F16.gguf \Ministral-3-14B-Nymphaea-RP.Q5_Mix.gguf \q5_k
Imatrix file for making your own quants is available here. I used this calibration dataset to create it, expanding it with RP and creative writing data (about 400k tokens).
Training Notes
Trained on the latest iteration of my Darkmere dataset. This version features expanded genre variety, built upon a mix of manually curated synthetics and human-written stories.
[!IMPORTANT] The base weights are abliterated via Heretic prior to fine-tuning, so this fine-tune is quite uncensored.
Method:
- Training Method: DoRA (Weight-Decomposed LoRA)
- Target Modules
all-linear - LoRA Rank: 64
- LoRA Alpha: 64
- LoRA Dropout: 0.05
Hyperparameters:
- Batch Size: 2 (Per-device)
- Gradient Accumulation: 2
- Epochs: 2
- Learning Rate: 1e-4
- Optimizer:
adamw_torch_fused - LR Scheduler:
cosine - Noise Level:
neftune_noise_alpha=5
[!Note] The vision encoder was frozen during training, so the model retains its native vision capabilities.
Special Thanks
This fine-tune wouldn't be possible without the incredible work of the community:
- p-e-w for developing Heretic - an essential tool for censorship removal.
- SicariusSicariiStuff for developing SLOP_Detector script.
- Mistral AI for their Ministral 3 weights.
- AMD for their Instinct™ MI300X GPU.
Model provider
Yingyaeliae
Model tree
Base
0xA50C1A1/Ministral-3-14B-Instruct-2512-BF16-SOM-MPOA
Fine-tuned
this model
Modalities
Input
Text
Output
Text
Pricing
Dedicated Endpoints
View detailsSupported Functionality
Model APIs
Dedicated Endpoints
Container
More information