Casual-Autopsy

G4-MeroMero-31B-StyleSwap

README

License: apache-2.0

MeroMero-31B Style Swap

An experimental tensor-swap merge targeting only one tensor: lm_head.weight. The merge consists of two models:

The base: zerofata/G4-MeroMero-31B
The donor: Gryphe/Gemma-4-31B-StyleTune
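
As a rough illustration of the swap described above, here is a minimal sketch that copies named tensors from a donor state dict into a copy of the base. It is not the script used to build this model. The helper name `swap_tensors`, the toy values, and the `model.embed_tokens.weight` key are assumptions made for the example. In practice the two mappings would come from the real checkpoints rather than from plain lists.

```python
from typing import Any, Dict, Iterable, Mapping


def swap_tensors(
    base: Mapping[str, Any],
    donor: Mapping[str, Any],
    names: Iterable[str],
) -> Dict[str, Any]:
    """Return a copy of `base` with the named tensors taken from `donor`.

    Hypothetical helper: works on any name -> tensor mapping.
    """
    merged = dict(base)  # shallow copy; untouched tensors are shared with base
    for name in names:
        if name not in base or name not in donor:
            raise KeyError(f"{name!r} must exist in both checkpoints")
        # Real tensors expose .shape; the toy lists below do not, so both are None.
        base_shape = getattr(base[name], "shape", None)
        donor_shape = getattr(donor[name], "shape", None)
        if base_shape != donor_shape:
            raise ValueError(
                f"shape mismatch for {name!r}: {base_shape} vs {donor_shape}"
            )
        merged[name] = donor[name]
    return merged


# Toy stand-ins for the two checkpoints (placeholder values, not real weights).
base_sd = {
    "model.embed_tokens.weight": [[0.1, 0.2]],  # assumed key, for illustration
    "lm_head.weight": [[1.0, 2.0]],
}
donor_sd = {
    "model.embed_tokens.weight": [[0.3, 0.4]],
    "lm_head.weight": [[9.0, 8.0]],
}

# Only lm_head.weight is replaced; every other tensor stays as in the base.
merged_sd = swap_tensors(base_sd, donor_sd, ["lm_head.weight"])
```

The shape check is there because a swapped tensor has to match the base exactly, otherwise the merged checkpoint would not load.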

The theory behind this is that, since Gryphe's tune touches what your typical fine-tune doesn't, merging through tensor swapping should be practically lossless. It's also possible that it would make for good merging fodder after further fine-tuning, but that's a theory for someone more knowledgeable in training to dive into.

If you're interested in Gryphe's tuning method, I'd suggest reading the model card of the tune.
