License: apache-2.0
Abliteration parameters
| Parameter | Value |
|---|---|
| direction_index | 41.65 |
| attn.o_proj.max_weight | 1.03 |
| attn.o_proj.max_weight_position | 39.06 |
| attn.o_proj.min_weight | 0.02 |
| attn.o_proj.min_weight_distance | 20.13 |
| mlp.down_proj.max_weight | 1.14 |
| mlp.down_proj.max_weight_position | 47.70 |
| mlp.down_proj.min_weight | 0.81 |
| mlp.down_proj.min_weight_distance | 24.55 |
Performance
| Metric | This model | Original model (DarkArtsForge/Agares-31B-v1) |
|---|---|---|
| KL divergence | 0.0085 | 0 (by definition) |
| Refusals | 17/100 | 90/100 |
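The "0 (by definition)" entry follows from the definition of KL divergence: a distribution compared with itself has zero divergence. Here is a minimal sketch on toy next-token distributions. The distributions and the function name are illustrative assumptions, and the card does not state how the 0.0085 figure was measured.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token distributions, not taken from either model.
original = [0.70, 0.20, 0.10]
modified = [0.68, 0.21, 0.11]

same = kl_divergence(original, original)   # a model compared with itself
drift = kl_divergence(original, modified)  # small but positive
print(f"self: {same}, modified: {drift:.5f}")
```

A value close to zero therefore means the modified model's output distribution stays close to the original's on whatever prompts were used for the measurement.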
🐊 Agares 31B v1
Agares is a quick merge test for the upcoming Goetia 31B. The idea was to see if Artemis v1h would destroy the merge or not.
Although Artemis v1h is significantly degraded on its own when taken from a Q8_0 source, most of that damage appears to be mitigated once it is merged, thanks to the merge method used, in this case DELLA's selective magnitude pruning.
Censorship levels weren't tested, though in theory the model should refuse somewhat less often because of normalize: false and the presence of at least one heretic donor.
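To see what normalize: false changes in practice, the donor weights from the configuration further down can be summed. This sketch assumes that normalization in mergekit would divide each weight by the weight total, and the shortened donor names are used only for readability.

```python
# Donor weights copied from the merge configuration in this card.
weights = {
    "Artemis-31B-v1h": 0.10,
    "Fabled-Gemma4-31B": 0.10,
    "Equinox-31B": 0.10,
    "Ortenzya-heretic": 0.15,
    "Harmonia-31B": 0.15,
    "Animus-V14.0": 0.50,
    "GarnetV2-31B": 0.50,
}

total = sum(weights.values())
# Assumption: normalize: true would rescale the weights to sum to 1.
normalized = {name: w / total for name, w in weights.items()}

print(f"sum of weights as written: {total:.2f}")
for name, w in normalized.items():
    print(f"{name:<20} {weights[name]:.2f} -> {w:.4f}")
```

With normalization off, the combined donor deltas are applied at a total weight above 1 rather than being scaled back down, which is one plausible reason to expect a stronger pull away from the base model's behavior.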
Update
This model is more censored than Goetia 31B, as tested with the Q0 Benchmark.
Donors were first scanned via the della_audit script in order to gauge their influence on the merge. Weights were then modified to allow for balanced distribution of each model's influence within the MLP layers.
```bat
[DELLA Audit] Layer: model.language_model.layers.25.mlp.down_proj.weight | Lambda=1.00
[BASE] google--gemma-4-31B-it
BeaverAI--Artemis-31B-v1h-GGUF                    : ██████ 13.8% (W:0.10 D:0.90 N:2.82 E:0.09)
ConicCat--Gemma4-GarnetV2-31B                     : ████████ 16.4% (W:0.50 D:0.90 N:0.67 E:0.09)
Darkhn-Gemma-4-31B-Animus-V14.0                   : ███████ 15.7% (W:0.50 D:0.90 N:0.64 E:0.09)
Lambent--Fabled-Gemma4-31B                        : ██████ 13.6% (W:0.10 D:0.90 N:2.77 E:0.09)
LatitudeGames--Equinox-31B                        : ██████ 13.6% (W:0.10 D:0.90 N:2.77 E:0.09)
llmfan46--gemma-4-Ortenzya-The-Creative-Wordsmith-: ██████ 13.3% (W:0.15 D:0.90 N:1.81 E:0.09)
virtuous7373--Gemma-4-Harmonia-31B                : ██████ 13.6% (W:0.15 D:0.90 N:1.85 E:0.09)
```
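The percentages in the audit output are numerically consistent with each donor's share of W × N. The sketch below reproduces them from the printed values. Reading W as the merge weight and N as a per-donor delta magnitude is an assumption, and the formula is inferred from the numbers rather than taken from the della_audit script.

```python
# (W, N) pairs copied from the audit line for layer 25 mlp.down_proj.
donors = {
    "Artemis-31B-v1h": (0.10, 2.82),
    "GarnetV2-31B": (0.50, 0.67),
    "Animus-V14.0": (0.50, 0.64),
    "Fabled-Gemma4-31B": (0.10, 2.77),
    "Equinox-31B": (0.10, 2.77),
    "Ortenzya": (0.15, 1.81),
    "Harmonia-31B": (0.15, 1.85),
}

def influence_shares(donors):
    """Percentage share of each donor, assuming influence is proportional to W * N."""
    scores = {name: w * n for name, (w, n) in donors.items()}
    total = sum(scores.values())
    return {name: 100.0 * s / total for name, s in scores.items()}

shares = influence_shares(donors)
for name, pct in shares.items():
    print(f"{name:<20} {pct:5.1f}%")
```

Under that reading, a donor with a large N needs only a small weight to reach the same share as a donor with a small N, which matches the weights of 0.10 to 0.15 paired with N near 2 to 3 and the weights of 0.50 paired with N near 0.65.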
Merge Details
Merge Method
This model was merged using the DELLA merge method, with B:/31B/google--gemma-4-31B-it as the base.
This merge also required the sparsity v3 patch, the notes of which are here.
Models Merged
The following models were included in the merge:
- B:/31B/Darkhn-Gemma-4-31B-Animus-V14.0
- B:/31B/LatitudeGames--Equinox-31B
- B:/31B/Lambent--Fabled-Gemma4-31B
- B:/31B/virtuous7373--Gemma-4-Harmonia-31B
- B:/31B/ConicCat--Gemma4-GarnetV2-31B
- B:/31B/llmfan46--gemma-4-Ortenzya-The-Creative-Wordsmith-31B-it-uncensored-heretic
- B:/31B/BeaverAI--Artemis-31B-v1h-GGUF
Configuration
The following YAML configuration was used to produce this model:
```yaml
architecture: Gemma4ForConditionalGeneration
base_model: B:/31B/google--gemma-4-31B-it
models:
  - model: B:/31B/BeaverAI--Artemis-31B-v1h-GGUF
    parameters:
      weight: 0.1
      density: 0.9
      epsilon: 0.09
  - model: B:/31B/Lambent--Fabled-Gemma4-31B
    parameters:
      weight: 0.1
      density: 0.9
      epsilon: 0.09
  - model: B:/31B/LatitudeGames--Equinox-31B
    parameters:
      weight: 0.1
      density: 0.9
      epsilon: 0.09
  - model: B:/31B/llmfan46--gemma-4-Ortenzya-The-Creative-Wordsmith-31B-it-uncensored-heretic
    parameters:
      weight: 0.15
      density: 0.9
      epsilon: 0.09
  - model: B:/31B/virtuous7373--Gemma-4-Harmonia-31B
    parameters:
      weight: 0.15
      density: 0.9
      epsilon: 0.09
  - model: B:/31B/Darkhn-Gemma-4-31B-Animus-V14.0
    parameters:
      weight: 0.5
      density: 0.9
      epsilon: 0.09
  - model: B:/31B/ConicCat--Gemma4-GarnetV2-31B
    parameters:
      weight: 0.5
      density: 0.9
      epsilon: 0.09
merge_method: della
parameters:
  lambda: 1.0
  normalize: false
  int8_mask: false
  rescale: true
dtype: float32
out_dtype: bfloat16
tokenizer:
  source: union
chat_template: auto
```
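The density and epsilon values above control magnitude-ranked pruning of each donor's delta from the base. The following is a simplified sketch of that idea, not mergekit's implementation. It assumes the keep probability rises linearly with magnitude rank from density - epsilon to density + epsilon, and that kept entries are rescaled by the inverse of their keep probability. The function names are hypothetical.

```python
import random

def keep_probabilities(n, density, epsilon):
    """Keep probability per magnitude rank (rank 0 = smallest magnitude)."""
    if n == 1:
        return [density]
    low = density - epsilon
    return [low + 2 * epsilon * rank / (n - 1) for rank in range(n)]

def magnitude_prune(deltas, density, epsilon, rng):
    """Randomly drop delta entries, favoring large magnitudes, then rescale."""
    order = sorted(range(len(deltas)), key=lambda i: abs(deltas[i]))
    probs = keep_probabilities(len(deltas), density, epsilon)
    pruned = [0.0] * len(deltas)
    for rank, i in enumerate(order):
        if rng.random() < probs[rank]:
            # Rescale so the expected value of each entry is unchanged.
            pruned[i] = deltas[i] / probs[rank]
    return pruned

rng = random.Random(0)
deltas = [rng.gauss(0.0, 1.0) for _ in range(10_000)]  # toy delta vector
pruned = magnitude_prune(deltas, density=0.9, epsilon=0.09, rng=rng)
kept_fraction = sum(1 for v in pruned if v != 0.0) / len(pruned)
print(f"kept {kept_fraction:.3f} of the delta entries")
```

With density 0.9 and epsilon 0.09, the smallest-magnitude entries would be kept with probability 0.81 and the largest with probability 0.99 under these assumptions, so roughly 90% of each delta survives overall.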
Model provider
sh0ck0r
Model tree
- Base: Darkhn/Gemma-4-31B-Animus-V14.0
- Base: LatitudeGames/Equinox-31B
- Base: Lambent/Fabled-Gemma4-31B
- Base: virtuous7373/Gemma-4-Harmonia-31B
- Base: ConicCat/Gemma4-GarnetV2-31B
- Base: google/gemma-4-31B-it
- Base: BeaverAI/Artemis-31B-v1h-GGUF
- Base: llmfan46/gemma-4-Ortenzya-The-Creative-Wordsmith-31B-it-uncensored-heretic
- Merged: this model
Modalities
- Input: Text, Image
- Output: Text