License: apache-2.0
An improvement over v1.
There is still slop with the "not x, but y" prose, though it writes better otherwise. It talked about a lighthouse / cursed island instead of the clockmaker shop.
I think v1.1 isn't as good as the original: it has a lot more subtle refusals than v1, shorter replies, and more negative, Gemini-like behavior. It seems that moe_karcher is better than moe_slerp.
A magnitude scan reveals that MeroMero had the highest L2 norm, followed by Animus, then Musica. This means that MeroMero had the "strongest pull" on the karcher direction.
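As a rough illustration of what a magnitude scan could look like, the sketch below ranks models by the L2 norm of a flattened weight vector. This is an assumption about the procedure, not the script actually used here: the source does not say which tensors were scanned or how they were flattened. The function names (`l2_norm`, `magnitude_scan`) and the toy numbers are hypothetical.

```python
import math

def l2_norm(values):
    """Euclidean (L2) norm of a flat sequence of floats."""
    return math.sqrt(sum(v * v for v in values))

def magnitude_scan(models):
    """Return (name, norm) pairs sorted from largest to smallest L2 norm."""
    norms = {name: l2_norm(weights) for name, weights in models.items()}
    return sorted(norms.items(), key=lambda item: item[1], reverse=True)

# Toy numbers invented for illustration; NOT measurements from the real models.
toy_models = {
    "model_a": [3.0, 4.0],
    "model_b": [1.0, 2.0, 2.0],
    "model_c": [0.0, 1.0],
}

ranking = magnitude_scan(toy_models)
for name, norm in ranking:
    print(f"{name}: {norm:.3f}")
```

In a ranking like this, the first entry plays the role the text assigns to MeroMero: the input with the largest norm.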
100 iterations are enough to produce about the same fidelity as 1000.
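To show how an iteration cap and a tolerance interact, here is a minimal sketch of a textbook Karcher (Riemannian) mean on the unit sphere. It is not the moe_karcher implementation, whose internals are not described here; `karcher_mean_sphere` and its helpers are hypothetical names, and the `max_iter` and `tol` arguments only mirror the config keys of the same name. The loop exits as soon as the update step falls below `tol`, so a larger cap changes nothing once that point is reached.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _normalize(v):
    n = math.sqrt(_dot(v, v))
    return [x / n for x in v]

def karcher_mean_sphere(points, max_iter=100, tol=1e-9):
    """Karcher mean of unit vectors; returns (mean, iterations_used).

    Toy version: assumes the points are not antipodal and that their
    Euclidean average is non-zero.
    """
    points = [_normalize(p) for p in points]
    dim = len(points[0])
    # Start from the normalized Euclidean average.
    mu = _normalize([sum(p[i] for p in points) / len(points) for i in range(dim)])
    for it in range(1, max_iter + 1):
        # Average of the log-map (tangent) vectors of all points at mu.
        grad = [0.0] * dim
        for p in points:
            c = max(-1.0, min(1.0, _dot(mu, p)))
            theta = math.acos(c)
            if theta < 1e-12:
                continue  # point coincides with mu, contributes nothing
            scale = theta / math.sin(theta)
            for i in range(dim):
                grad[i] += scale * (p[i] - c * mu[i]) / len(points)
        step = math.sqrt(_dot(grad, grad))
        if step < tol:
            return mu, it  # converged before hitting the cap
        # Exponential map: move along the geodesic in the direction of grad.
        mu = _normalize([math.cos(step) * mu[i] + math.sin(step) * grad[i] / step
                         for i in range(dim)])
    return mu, max_iter

# Three points on the unit circle at 0, 30 and 90 degrees.
angles_deg = [0.0, 30.0, 90.0]
pts = [[math.cos(math.radians(a)), math.sin(math.radians(a))] for a in angles_deg]
mean, iters = karcher_mean_sphere(pts, max_iter=100, tol=1e-9)
mean_angle = math.degrees(math.atan2(mean[1], mean[0]))
print(f"mean angle: {mean_angle:.6f} degrees after {iters} iteration(s)")
```

On this toy input the mean angle lands at the average of the three angles (40 degrees) well before the cap. Whether the real merge converges that quickly depends on the actual tensors and on how the tool maps them onto a manifold, which this sketch does not model.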
The base model gemma-4-26B-A4B-it is still excluded from this version, but it might be added for v1.3.
```yaml
architecture: Gemma4ForConditionalGeneration
merge_method: moe_karcher
# base_model: B:\26B\google--gemma-4-26B-A4B-it
models:
  - model: B:\26B\AuriAetherwiing--G4-26B-A4B-Musica-v1
  - model: B:\26B\ApocalypseParty--G4-26B-SFT-6 # zerofata/G4-MeroMero-26B-A4B
  - model: B:\26B\Darkhn--Gemma-4-26B-A4B-Animus-V14.1-FFT
parameters:
  max_iter: 100
  tol: 1.0e-9
  router_strategy: karcher # Options: karcher, average, first, random_init
  blend_experts: true # Blend corresponding experts (expert[0] + expert[0], etc.)
dtype: float32
out_dtype: bfloat16
tokenizer:
  source: union
# chat_template: auto
trust_remote_code: true
name: G4-Runic-Oarfish-26B-A4B-v1.2
```
See v1 for more details of how to merge Gemma 4 MoE models.
Model provider: dr-housemd
Model tree: Base: Naphula/G4-Runic-Oarfish-26B-A4B-v1.2; Quantized: this model
Modalities: Input: Text, Image; Output: Text