Naphula
Goetia-26B-A4B-v1.3-Absolute-Heretic-ARA
License: apache-2.0
Models
- Base model: Naphula/Goetia-26B-A4B-v1.3 (Pure 16-bit BFloat16)
Datasets
- Good prompts: mlabonne/harmless_alpaca (Commit: 02c6a92)
- Bad prompts: mlabonne/harmful_behaviors (Commit: 01cead0)
- Good evaluation prompts: mlabonne/harmless_alpaca (Commit: 02c6a92)
- Bad evaluation prompts: mlabonne/harmful_behaviors (Commit: 01cead0)
Selected trial
- Trial number: 100
- KL divergence: 0.0309
- Refusals: 3/100
- Method: Arbitrary-Rank Ablation (ARA) with Surgical Narrowing
Environment
- Heretic: v1.2.0-dev (Blackwell Optimized)
- PyTorch: 2.8.0+cu128
- Hardware: NVIDIA RTX 6000 Blackwell (96GB VRAM)
- Other dependencies: See `requirements.txt`.
Contents of this directory
- `requirements.txt`: The exact versions of all Python packages (Blackwell/CUDA 12.8 stack).
- `config.toml`: The exact configuration used, including the 16-bit stable loading path.
- `Naphula--Goetia-26B-A4B-v1.3.jsonl`: The Optuna study journal containing the history of all 100+ trials.
- `SHA256SUMS`: Cryptographic hashes for all weight files.
- `reproduce.json`: A machine-readable file containing all reproducibility information.
How to reproduce
[!TIP] You can automate this process, including all verification steps, by downloading the `reproduce.json` file and running `python3 -c "from heretic.main import main; main()" --reproduce reproduce.json`.
- Install the Blackwell-compatible version of PyTorch: `pip install torch==2.8.0 --index-url https://download.pytorch.org/whl/cu128`
- Install the packages listed in `requirements.txt`: `pip install -r requirements.txt`
- Apply the Heretic source patches for 16-bit ARA and Surgical Narrowing.
- Place the provided `config.toml` in your working directory.
- Run the execution payload:
bash
export PYTHONPATH=/workspace/heretic/src
python3 -c "from heretic.main import main; main()" --model "/workspace/Naphula/Goetia-26B-A4B-v1" --use-ara
- Wait for the run to finish, then select trial 100 and export the model.
- Verify that the weight files have been exactly reproduced by comparing their SHA-256 hashes against those in `SHA256SUMS`: `sha256sum -c SHA256SUMS`
[!TIP] To use the included Optuna study journal `Naphula--Goetia-26B-A4B-v1.3.jsonl`, place it in the `checkpoints/` directory before running. This allows you to resume the study or export other Pareto-optimal candidates.
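Where `sha256sum` is not available, the final verification step can be approximated in Python. This is a minimal sketch that assumes `SHA256SUMS` uses the usual `<hex digest>  <filename>` layout written by `sha256sum`; the helper names `sha256_of` and `verify_sums` are hypothetical and not part of Heretic.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weight shards never sit fully in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sums(sums_file: Path) -> dict[str, bool]:
    """Return {filename: matches} for every entry in a sha256sum-style file."""
    results = {}
    for line in sums_file.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # sha256sum marks binary-mode entries with '*'
        results[name] = sha256_of(sums_file.parent / name) == expected.lower()
    return results
```

Calling `verify_sums(Path("SHA256SUMS"))` in the model directory returns one boolean per weight file; any `False` entry means the reproduction differs from the published weights.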
This is a merge of pre-trained language models created using mergekit.
Merge Details
Merge Method
This model was merged using the MoE DELLA merge method using google/gemma-4-26B-A4B as a base.
Models Merged
The following models were included in the merge:
- google/gemma-4-26B-A4B
- ApocalypseParty/G4-26B-SFT-6
- AuriAetherwiing/G4-26B-A4B-Musica-v1
- BeaverAI/Orion-26B-A4B-v1a-GGUF
- BeaverAI/Orion-26B-A4B-v1b-GGUF
- Darkhn/Gemma-4-26B-A4B-Animus-V14.1-FFT
- Gryphe/Gemma-4-26B-A4B-StyleTune
- Gryphe/Pantheon-Reasoning-26B-A4B-1.1
- Locutusque/Esmeralda-Gemma4-26B-A4B
- ReadyArt/Dark-Scarlett-v1.0-26B-A4B-GGUF
- ReadyArt/For-Her-Darkside-26B-A4B-v1.4-GGUF
- ReadyArt/Melody1437-26B-A4B-GGUF
- ReadyArt/Omega-Evolution-26B-A4B-v3.0-GGUF
- ReadyArt/Serenity-26B-A4B-GGUF
- zerofata/G4-MeroMero-26B-A4B
⚙️ Configuration
yaml
architecture: Gemma4ForConditionalGeneration
base_model: B:\26B\google_gemma-4-26B-A4B
models:
  - model: B:\26B\BeaverAI_Orion-26B-A4B-v1a-GGUF
    parameters:
      weight: 1.0
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\BeaverAI_Orion-26B-A4B-v1b-GGUF
    parameters:
      weight: 1.0
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\ReadyArt_Dark-Scarlett-v1.0-26B-A4B-Q8_0
    parameters:
      weight: 1.0
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\ReadyArt_Omega-Evolution-26B-A4B-v3.0-HB16-Q8_0
    parameters:
      weight: 1.0
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\ReadyArt_For-Her-Darkside-26B-A4B-v1.4-HB16-Q8_0
    parameters:
      weight: 1.0
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\ReadyArt_Melody1437-26B-A4B-HB16-Q8_0
    parameters:
      weight: 1.0
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\ReadyArt_Serenity-26B-A4B-HB16-Q8_0
    parameters:
      weight: 1.0
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\ApocalypseParty_G4-26B-SFT-6
    parameters:
      weight: 1.0
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\zerofata_G4-MeroMero-26B-A4B
    parameters:
      weight: 1.0
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\AuriAetherwiing_G4-26B-A4B-Musica-v1
    parameters:
      weight: 1.0
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\Darkhn_Gemma-4-26B-A4B-Animus-V14.1-FFT
    parameters:
      weight: 1.0
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\Locutusque_Esmeralda-Gemma4-26B-A4B
    parameters:
      weight: 1.0
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\Gryphe--Pantheon-Reasoning-26B-A4B-1.1
    parameters:
      weight: 1.0
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\Gryphe--Gemma-4-26B-A4B-StyleTune
    parameters:
      weight:
        - filter: "output|lm_head"
          value: 13.0 # 50% of normalized lm_head weight
        - value: 1.0 # 7.14% of normalized other weights
      density: 0.9
      epsilon: 0.09
merge_method: moe_della
parameters:
  lambda: 1.0
  normalize: true
  int8_mask: false
  rescale: true
  router_strategy: della # average # random_init
  blend_experts: true
dtype: float32
out_dtype: bfloat16
tokenizer:
  source: union
chat_template: auto
name: Goetia 26B A4B v1.3
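The percentages in the `weight` comments can be checked by hand, assuming `normalize: true` divides each model's weight by the sum of all weights applied to the same tensor. The sketch below reproduces that arithmetic for the 14 listed fine-tunes; `normalized_share` is an illustrative helper, not a mergekit function.

```python
def normalized_share(weight: float, other_weights: list[float]) -> float:
    """Fraction of the merged result attributed to one model after normalization."""
    return weight / (weight + sum(other_weights))


others = [1.0] * 13  # the 13 other fine-tunes, each at weight 1.0

lm_head_share = normalized_share(13.0, others)  # StyleTune on output|lm_head
default_share = normalized_share(1.0, others)   # StyleTune on every other tensor

print(f"lm_head: {lm_head_share:.2%}, other tensors: {default_share:.2%}")
# lm_head: 50.00%, other tensors: 7.14%
```

Under that assumption, the StyleTune model supplies half of the normalized `lm_head` contribution, while every model contributes an equal 1/14 share elsewhere.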
This is a decensored version of Naphula/Goetia-26B-A4B-v1.3, made using Heretic v1.2.0 with the Arbitrary-Rank Ablation (ARA) method (with row-norm preservation).
This model was merged locally on an RTX 3060 Ti and then hereticized on a RunPod cloud RTX Pro 6000 (96GB VRAM) for approximately $20 USD.
I also tested MPOA (Magnitude Preserving Orthogonal Ablation), but this method was unable to uncensor Gemma 4.
See also the Arbitrary Rank Inversion (ARI) variant.
Heretication Results
| Metric | This model | Original model |
|---|---|---|
| KL divergence | 0.0309 | 0 (by definition) |
| Refusals | 3/100 | 100/100 |
Degree of Heretication
The Heresy Index weighs the resulting model's corruption by the process (KL Divergence) and its abolition of doctrine (Refusals) for a final verdict in classification.
| Index Entry | Classification | Analysis |
|---|---|---|
| Absolute Heresy | Fewer than 10/100 Refusals and less than 0.10 KL Divergence | |
| Tainted Heresy | Around 11-25/100 Refusals and/or 0.11-0.20 KL Divergence | |
| Impotent Heresy | Anything above 25/100 Refusals and 0.21 KL Divergence | |
Note: This is an arbitrary classification inspired by Warhammer 40K, having no tangible indication towards the model's performance.
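The index can be expressed as a small function. This is a sketch under stated assumptions: the table leaves the boundaries loose (10 refusals, and the "and/or" wording), so here a model drops a tier as soon as either metric crosses a threshold, and exactly 10 refusals or 0.10 KL still counts as Absolute. The function name `heresy_class` is hypothetical.

```python
def heresy_class(refusals: int, kl_divergence: float) -> str:
    """Map (refusals out of 100, KL divergence) to a Heresy Index tier."""
    if refusals > 25 or kl_divergence > 0.20:
        return "Impotent Heresy"
    if refusals > 10 or kl_divergence > 0.10:
        return "Tainted Heresy"
    return "Absolute Heresy"


# Selected trial of this model: 3/100 refusals, KL divergence 0.0309.
print(heresy_class(3, 0.0309))  # Absolute Heresy
```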
🧙 Heretic Grimoire
reproduce.json
json
{
  "version": "1.2.0-dev",
  "base_model": "Naphula/Goetia-26B-A4B-v1.3",
  "timestamp": "2026-06-19T08:04:47Z",
  "metrics": {
    "kl_divergence": 0.030937770381569862,
    "refusals": 3,
    "n_bad_prompts": 100
  },
  "parameters": {
    "start_layer_index": "14",
    "end_layer_index": "26",
    "preserve_good_behavior_weight": "1.4404",
    "steer_bad_behavior_weight": "0.0100",
    "overcorrect_relative_weight": "0.9144",
    "neighbor_count": "15"
  },
  "target_components": ["attn.o_proj"],
  "hardware": "RTX 6000 Blackwell (96GB)"
}
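Note that the `parameters` block stores every value as a string, so a consumer has to convert them before reuse. A minimal sketch of that conversion follows; it embeds only the `parameters` excerpt from the file, and the `INT_KEYS` set and `typed_parameters` helper are assumptions made for illustration.

```python
import json

# Excerpt of reproduce.json: only the "parameters" block is needed here.
raw = '''{"parameters": {"start_layer_index": "14", "end_layer_index": "26",
"preserve_good_behavior_weight": "1.4404", "steer_bad_behavior_weight": "0.0100",
"overcorrect_relative_weight": "0.9144", "neighbor_count": "15"}}'''

# Keys that hold integer values; everything else is parsed as a float.
INT_KEYS = {"start_layer_index", "end_layer_index", "neighbor_count"}


def typed_parameters(document: dict) -> dict:
    """Convert the string-encoded trial parameters to int or float."""
    return {
        key: int(value) if key in INT_KEYS else float(value)
        for key, value in document["parameters"].items()
    }


params = typed_parameters(json.loads(raw))
print(params["end_layer_index"] - params["start_layer_index"])  # 12
```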
💡 Innovations
- A special `gguf_to_safetensors_v5.py` calibrated for Gemma 4 was used in combination with a modified `tasks.py`.
- A custom method, `moe_della`, was scripted to add DELLA merging for MoE models. Other changes to mergekit scripts were also required, such as `architecture/auto.py`, `architecture/base.py`, `architecture/json_definitions.py`, `mergekit/common.py`, `io/tasks.py`, `tokenizer/embed.py`, and `mergekit/_data/architectures/gemma4.json`.
🔧 Summary of Changes
Here is a comprehensive summary of the modifications made to the Heretic codebase (config.py, main.py, and model.py). You can provide this directly to the Heretic developers as a patch summary or pull request rationale for supporting 16-bit Arbitrary-Rank Ablation (ARA) on high-VRAM hardware (like Blackwell RTX 6000 / B300) and Gemma 4 MoE architectures.
1. config.py
Goal: Optimize the TPE sampler for constrained search spaces.
- Reduced `n_startup_trials` (60 → 30):
  - Why: When using "Surgical Narrowing" (manually restricting the parameter search space based on prior knowledge), 60 random exploratory trials waste compute. Dropping this to 30 allows the Tree-structured Parzen Estimator (TPE) to begin correlating steering weights and KL divergence much earlier, drastically speeding up convergence on 100-trial runs.
2. main.py
Goal: Implement Surgical Narrowing, fix environment bugs, and add reproducibility exports.
- Bypassed `version('heretic-llm')`:
  - Why: Hardcoded the version string to `v1.2.0-dev` in the CLI header and Hugging Face README generator. This prevents `PackageNotFoundError` crashes when running the scripts directly from the source directory without installing the package via `pip`.
- Implemented "Surgical Narrowing" in ARA Sampling:
  - Why: Replaced the broad `trial.suggest_*` ranges with tightly constrained boundaries (e.g., layers 14–27, high preservation weights, and specific steering weight logs). This forces the optimizer to focus exclusively on the 16-bit "Golden Zone" required for stable MoE models, ensuring KL divergence stays below 0.05.
- Added "Grimoire" Reproducibility Export:
  - Why: Injected a custom block into the "Save to local folder" logic. It automatically creates a `/reproduce` subfolder containing a `reproduce.json` (with exact trial parameters, metrics, and hardware info) and copies the Optuna `study_history.db`. This ensures 100% lineage and reproducibility for archived models.
3. model.py
Goal: Enable lossless 16-bit ARA, fix PyTorch autograd crashes, and ensure weight injection works.
- Forced `AutoModelForCausalLM` in `get_model_class`:
  - Why: Bypassed the `vision_config` check. Gemma 4 / MoE hybrid models were triggering architecture detection errors or `set_submodule` crashes.
- Disabled `bitsandbytes` and Forced Pure BF16 Loading:
  - Why: Removed 4-bit quantization logic and forced `torch.bfloat16` with `low_cpu_mem_usage=False`. This allows high-VRAM environments (96GB+) to perform lossless, full-depth abliteration without BNB artifacts or `meta tensor` crashes.
- Fixed `ValueError: can't optimize a non-leaf Tensor`:
  - Why: In `ara_abliterate`, casting the module weight to FP32 (`module.weight.to(torch.float32)`) created a tensor connected to the autograd graph. Added `.detach()` before `.requires_grad_(True)` to ensure the L-BFGS optimizer receives a valid leaf tensor.
- Fixed `RuntimeError: expected mat1 and mat2 to have the same dtype`:
  - Why: The captured `good_module_io` and `bad_module_io` tensors were in BFloat16, but the ARA optimization matrix was cast to Float32. Explicitly cast the I/O tensors to `dtype=torch.float32` inside the optimization loop so PyTorch's `matmul` (`@`) operator doesn't crash.
- Fixed the "Zero KL / 100 Refusals" Bug (Weight Injection Failure):
  - Why: Using `matrix.copy_(get_matrix())` at the end of the ARA loop failed to update the active computation graph in 16-bit mode, causing the evaluator to test the unmodified model.
  - Fix: Replaced the parameter entirely using `module.weight = torch.nn.Parameter(...)`, explicitly cast the optimized matrix back to `torch.bfloat16`, deleted any lingering `quant_state` attributes, and added `torch.cuda.synchronize()` to ensure VRAM was updated before the evaluation step began.
💉 Surgical Narrowing
A custom surgical narrowing strategy was used specifically for this Gemma 4 moe_della merge: a range lock was added in main.py, and TPE sampling was set to begin after 30 random startup trials instead of 60, which speeds up the search.
(This was only possible because I ran 150 trials before with a failed bitsandbytes 4-bit quantization, which wouldn't quantize, but produced ideal "target lock" coordinates.)
For trials 90-100, a "tightened grip" was applied. Trial 100 did so well that it was chosen as the release version.
Several other edits were made to various scripts to allow for Gemma 4 models to be ablated with ARA. Most of these changes are noted below.
To perform a lossless, full 16-bit depth ARA abliteration on your RTX 6000 Blackwell (96GB VRAM) with the surgical narrowing strategy, follow these instructions.
Step 1: Environment Setup
Run these to ensure a clean, Blackwell-optimized environment.
bash
# 1. System Prep
apt-get update && apt-get install -y git
git clone https://github.com/p-e-w/heretic.git
cd heretic
git checkout ara
# 2. Clean environment
pip uninstall -y heretic-llm kernels pydantic pydantic-settings optuna transformers accelerate peft bitsandbytes lm-eval evaluate torchvision torch torchaudio
# 3. Install Blackwell-Compatible Stack (CUDA 12.8)
pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/cu128
# 4. Install Dependencies (Excluding bitsandbytes to avoid conflicts)
pip install pydantic==2.10.0 pydantic-settings optuna==4.1.0 questionary rich transformers==5.12.1 accelerate peft lm-eval==0.4.7
Step 2: Mandatory Library Patches
bash
# Fix Transformers 5.x Vision class error
sed -i 's/transformers.AutoModelForVision2Seq/transformers.AutoModelForSeq2SeqLM/g' /usr/local/lib/python3.12/dist-packages/lm_eval/models/hf_vlms.py
# Fix PIQA trust_remote_code logic
sed -i "s/self.dataset = datasets.load_dataset(/if 'trust_remote_code' not in dataset_kwargs: dataset_kwargs['trust_remote_code'] = True\n self.dataset = datasets.load_dataset(/g" /usr/local/lib/python3.12/dist-packages/lm_eval/api/task.py
# Fix PIQA 401 Unauthorized redirect error
sed -i 's/dataset_path: piqa/dataset_path: ybisk\/piqa/g' /usr/local/lib/python3.12/dist-packages/lm_eval/tasks/piqa/piqa.yaml
# Force MoE fallback (Blackwell grouped_mm is unstable)
sed -i 's/hasattr(torch, "_grouped_mm")/False/g' /usr/local/lib/python3.12/dist-packages/transformers/integrations/moe.py
Step 3: Heretic Source Patches
File 1: src/heretic/main.py
Chunk 1 (Line ~155 & ~1015): Bypass Version Errors
Before >>>
python
print(f"[cyan]█░█░█▀▀░█▀▄░█▀▀░▀█▀░█░█▀▀[/] v{version('heretic-llm')}")
After <<<
python
print(f"[cyan]█░█░█▀▀░█▀▄░█▀▀░▀█▀░█░█▀▀[/] v1.2.0-dev")
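As an alternative to hardcoding the string, the lookup can fall back only when the package is not installed. This is a sketch of that variant, not the patch applied for this model; it uses the standard-library `importlib.metadata`, and the helper name `package_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def package_version(name: str, fallback: str) -> str:
    """Return the installed version of `name`, or `fallback` if it is not installed."""
    try:
        return version(name)
    except PackageNotFoundError:
        return fallback


# Shows the installed version, or "1.2.0-dev" when running from a source checkout.
print(f"v{package_version('heretic-llm', '1.2.0-dev')}")
```

This keeps the header accurate for pip installs while still avoiding the crash when running directly from `src/`.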
Chunk 2 (Line ~535): Surgical Narrowing & KL Target
Before >>>
python
if settings.use_ara:
    start_layer_index = trial.suggest_int(
        "start_layer_index",
        0,
        len(model.get_layers()) // 2,
    )
    end_layer_index = trial.suggest_int(
        "end_layer_index",
        len(model.get_layers()) // 2,
        len(model.get_layers()),
    )
    preserve_good_behavior_weight = trial.suggest_float(
        "preserve_good_behavior_weight",
        0.0,
        1.0,
    )
    steer_bad_behavior_weight = trial.suggest_float(
        "steer_bad_behavior_weight",
        0.0001,
        1.0,
        log=True,
    )
    overcorrect_relative_weight = trial.suggest_float(
        "overcorrect_relative_weight",
        0.0,
        1.3,
    )
    neighbor_count = trial.suggest_int(
        "neighbor_count",
        1,
        15,
    )
After <<<
python
if settings.use_ara:
    # SURGICAL NARROWING: Blackwell 16-bit Optimized Range
    start_layer_index = trial.suggest_int("start_layer_index", 14, 16)
    end_layer_index = trial.suggest_int("end_layer_index", 24, 27)
    # Force high preservation for KL < 0.05
    preserve_good_behavior_weight = trial.suggest_float("preserve_good_behavior_weight", 0.5, 1.5)
    # Narrow steering for 16-bit stability
    steer_bad_behavior_weight = trial.suggest_float("steer_bad_behavior_weight", 0.01, 0.08, log=True)
    overcorrect_relative_weight = trial.suggest_float("overcorrect_relative_weight", 0.8, 1.3)
    neighbor_count = trial.suggest_int("neighbor_count", 9, 15)
File 2: src/heretic/model.py
Chunk 1 (Line ~9): Disable bitsandbytes
Before >>>
python
import bitsandbytes as bnb
After <<<
python
# import bitsandbytes as bnb
Chunk 2 (Line ~110): Force 16-bit Stable Loading
Before >>>
python
for dtype in settings.dtypes:
    # ... (entire loop)
    break
After <<<
python
dtype = torch.bfloat16
print(f"* Loading model in FULL 16-BIT BFLOAT16 (Blackwell Stable Path)...")
try:
    self.model = get_model_class(settings.model).from_pretrained(
        settings.model,
        torch_dtype=dtype,
        device_map="auto",
        trust_remote_code=self.trusted_models.get(settings.model),
        low_cpu_mem_usage=False
    )
    if self.trusted_models.get(settings.model) is None:
        self.trusted_models[settings.model] = True
except Exception as error:
    print(f"* [red]Failed to load model:[/] {error}")
    raise error
Chunk 3 (Line ~565): Fix ARA for 16-bit (No Dequant)
Before >>>
python
for module_index, module in enumerate(modules):
    # See above for a (partial) justification of this cast.
    module = cast(Linear, module)
    matrix = module.weight
    row_norms = LA.vector_norm(matrix, dim=1, keepdim=True).detach()
After <<<
python
for module_index, module in enumerate(modules):
    module = cast(Linear, module)
    # Direct 16-bit access, no bitsandbytes dequant needed
    matrix = module.weight.to(torch.float32).requires_grad_(True)
    row_norms = LA.vector_norm(matrix, dim=1, keepdim=True).detach()
Yes, dropping the threshold is a smart move for Surgical Narrowing.
Since you are already narrowing the search space (start/end layers and weights) in main.py, the optimizer doesn't need 60 random trials to "find" the general area of success. By dropping the threshold to 20 or 30, you allow the TPE (Tree-structured Parzen Estimator) to start looking for the high-precision 16-bit "sweet spot" much earlier.
Why 30 trials is better for this run:
- Faster Convergence: TPE will start correlating the `steer_weight` and `kl_divergence` sooner.
- Efficiency: In a 100-run limit, 60 random trials would mean 60% of your compute is "guessing." At 30 trials, 70% of your compute is "optimizing."
- Blackwell Speed: With a batch size of 64, you'll burn through trials quickly. 100 trials is plenty to find a Pareto-optimal candidate when the search space is surgically narrowed.
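The split between random and model-guided trials quoted above is simple arithmetic. A minimal sketch, assuming every trial up to `n_startup_trials` is sampled randomly and every later one is TPE-guided; `trial_budget` is an illustrative helper, not an Optuna or Heretic function.

```python
def trial_budget(n_trials: int, n_startup_trials: int) -> tuple[float, float]:
    """Return (random share, model-guided share) of a study's trials."""
    random_share = min(n_startup_trials, n_trials) / n_trials
    return random_share, 1.0 - random_share


for startup in (60, 30):
    random_share, guided_share = trial_budget(100, startup)
    print(f"n_startup_trials={startup}: {random_share:.0%} random, {guided_share:.0%} TPE-guided")
# n_startup_trials=60: 60% random, 40% TPE-guided
# n_startup_trials=30: 30% random, 70% TPE-guided
```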
The Patch for src/heretic/config.py
To change the default behavior, apply this chunk:
Before >>> (Around line 215)
python
n_startup_trials: int = Field(
    default=60,
    description="Number of trials that use random sampling for the purpose of exploration.",
)
After <<<
python
n_startup_trials: int = Field(
    default=30,
    description="Number of trials that use random sampling for the purpose of exploration.",
)
The error ValueError: can't optimize a non-leaf Tensor occurs because matrix = module.weight.to(torch.float32).requires_grad_(True) creates a new tensor that is the result of a differentiable operation (.to()). In PyTorch, an optimizer can only be initialized with "leaf" tensors (tensors that aren't the result of an operation).
To fix this, you must wrap the new tensor in torch.nn.Parameter or detach it before setting requires_grad.
File: src/heretic/model.py
Chunk 1: Fix Non-Leaf Tensor Error in ara_abliterate
Before >>> (Around line 565)
python
for module_index, module in enumerate(modules):
    module = cast(Linear, module)
    # Direct 16-bit access, no bitsandbytes dequant needed
    matrix = module.weight.to(torch.float32).requires_grad_(True)
    row_norms = LA.vector_norm(matrix, dim=1, keepdim=True).detach()
After <<<
python
for module_index, module in enumerate(modules):
    module = cast(Linear, module)
    # Create a leaf tensor by detaching after the cast, then enable gradients
    matrix = module.weight.to(torch.float32).detach().requires_grad_(True)
    row_norms = LA.vector_norm(matrix, dim=1, keepdim=True).detach()
Explanation of the Fix
- `.to(torch.float32)`: This creates a new tensor that is "connected" to the original `module.weight` in the autograd graph.
- `.detach()`: This breaks the connection to the graph, making the new FP32 tensor a "leaf" node.
- `.requires_grad_(True)`: This tells PyTorch to start tracking operations on this new leaf tensor so the L-BFGS optimizer can compute gradients for it.
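The leaf versus non-leaf distinction can be reproduced outside Heretic. The sketch below uses a small random BF16 `torch.nn.Parameter` as a stand-in for `module.weight` (an assumption for illustration) and shows that only the detached tensor is accepted by `torch.optim.LBFGS`.

```python
import torch

# Stand-in for module.weight: a BF16 parameter that requires gradients.
weight = torch.nn.Parameter(torch.randn(4, 3, dtype=torch.bfloat16))

# The dtype cast is recorded by autograd, so the result is not a leaf.
non_leaf = weight.to(torch.float32).requires_grad_(True)

# Detaching first yields an independent FP32 leaf tensor.
leaf = weight.to(torch.float32).detach().requires_grad_(True)

print(non_leaf.is_leaf, leaf.is_leaf)  # False True

try:
    torch.optim.LBFGS([non_leaf])
except ValueError as error:
    print(error)  # can't optimize a non-leaf Tensor

optimizer = torch.optim.LBFGS([leaf])  # constructs without error
```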
The error RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::BFloat16 != float occurs because the good_module_io and bad_module_io tensors were captured in BFloat16 during the initial scan, but the ARA optimization loop is now trying to multiply them by the Float32 matrix we created for the L-BFGS optimizer.
File: src/heretic/model.py
Chunk: Fix Dtype Mismatch in ara_abliterate
Before >>> (Around line 595)
python
good_input, good_output = good_module_io[layer_index][component][module_index]
bad_input, bad_output = bad_module_io[layer_index][component][module_index]
good_input = good_input.to(matrix.device)
good_output = good_output.to(matrix.device)
bad_input = bad_input.to(matrix.device)
bad_output = bad_output.to(matrix.device)
After <<<
python
good_input, good_output = good_module_io[layer_index][component][module_index]
bad_input, bad_output = bad_module_io[layer_index][component][module_index]
# Force inputs/outputs to Float32 to match the optimization matrix
good_input = good_input.to(device=matrix.device, dtype=torch.float32)
good_output = good_output.to(device=matrix.device, dtype=torch.float32)
bad_input = bad_input.to(device=matrix.device, dtype=torch.float32)
bad_output = bad_output.to(device=matrix.device, dtype=torch.float32)
Explanation
The matrix used by the optimizer was explicitly cast to float32 in the previous patch to satisfy the "leaf tensor" requirement. PyTorch's @ (matmul) operator is strict: it will not automatically cast BFloat16 inputs to Float32. This patch ensures the captured I/O data matches the precision of the optimization weights.
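The mismatch is easy to reproduce with two small tensors. In this sketch the shapes and the names `matrix` and `captured` are placeholders chosen for illustration; only the dtypes mirror the situation described above.

```python
import torch

matrix = torch.randn(8, 4, dtype=torch.float32)     # FP32 optimization matrix
captured = torch.randn(2, 4, dtype=torch.bfloat16)  # BF16 activations, as captured

try:
    captured @ matrix.T  # mixed BF16 / FP32 operands
except RuntimeError as error:
    print("mismatch:", error)

# Casting the captured tensor to the matrix's device and dtype resolves it.
output = captured.to(device=matrix.device, dtype=torch.float32) @ matrix.T
print(output.dtype, tuple(output.shape))  # torch.float32 (2, 8)
```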
The KL divergence: 0.0000 and Refusals: 100/100 indicate that the optimized weights are not actually being applied to the model's active inference path. This happens because in your 16-bit environment, the module.weight is being updated in the ARA loop, but the model is likely still referencing the original weights or a cached version of the computation graph.
To fix this, we need to ensure the torch.nn.Parameter replacement is forced into the model's state and that we explicitly clear any residual gradients or cached states.
File: src/heretic/model.py
Apply this patch to the end of the ara_abliterate method to force the weight injection.
Before >>> (Around line 650)
python
with torch.no_grad():
    matrix.copy_(get_matrix())
After <<<
python
with torch.no_grad():
    # Get the optimized FP32 matrix and cast to model's compute dtype
    optimized_matrix = get_matrix().to(device=module.weight.device, dtype=torch.bfloat16)
    # Force replace the parameter to break any reference to the old weights
    module.weight = torch.nn.Parameter(optimized_matrix, requires_grad=False)
    # Ensure bitsandbytes quantization state is fully purged if it exists
    if hasattr(module.weight, "quant_state"):
        del module.weight.quant_state
    if hasattr(module, "quant_state"):
        del module.quant_state
    # Force a synchronization to ensure weights are committed to VRAM
    if torch.cuda.is_available():
        torch.cuda.synchronize()
Why this fixes the "Zero KL" issue:
- Parameter Re-binding: Simply using `copy_()` can fail if the underlying storage is locked or if the model is using a `PeftModel` wrapper that intercepts weight updates. Replacing the `module.weight` object entirely forces the model to use the new tensor.
- Dtype Alignment: By explicitly casting to `bfloat16` before the injection, we ensure the inference kernels (which are optimized for BF16 on Blackwell) see the modified data.
- CUDA Sync: The `synchronize()` call ensures that the weight updates are finished before the `Evaluator` starts the next pass.
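One contributing factor can be checked in isolation: after the earlier leaf-tensor patch, `matrix` is a detached FP32 copy of the weight, so an in-place `copy_()` into it cannot reach `module.weight`. The sketch below uses a toy `torch.nn.Linear` as a stand-in and a tensor of ones in place of `get_matrix()`; both are assumptions for illustration, and this may not be the only cause in the run described.

```python
import torch

module = torch.nn.Linear(3, 2, bias=False).to(torch.bfloat16)
before = module.weight.detach().clone()

# FP32 working copy, created the same way as in the leaf-tensor patch.
matrix = module.weight.to(torch.float32).detach().requires_grad_(True)
optimized = torch.ones_like(matrix)  # placeholder for get_matrix()

with torch.no_grad():
    matrix.copy_(optimized)  # updates the working copy only
unchanged = torch.equal(module.weight, before)

with torch.no_grad():
    # Re-bind the parameter so the module itself holds the new values.
    module.weight = torch.nn.Parameter(optimized.to(torch.bfloat16), requires_grad=False)
changed = not torch.equal(module.weight, before)

print(unchanged, changed)  # True True
```

The first flag shows the module still holding its original weights after `copy_()`; the second shows the re-bound parameter carrying the optimized values in BF16.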
Updated Execution Payload
Since you are starting fresh after the Ctrl+C, make sure to wipe the "zero-progress" database so the optimizer doesn't think those failed runs were valid data points.
bash
export PYTHONPATH=/workspace/heretic/src
export HF_DATASETS_TRUST_REMOTE_CODE=1
rm -rf /workspace/heretic/checkpoints/study.db
python3 -c "import sys; sys.path.insert(0, '/workspace/heretic/src'); from heretic.main import main; main()" \
  --model "/workspace/Naphula/Goetia-26B-A4B-v1.3" \
  --use-ara \
  --target-components '["attn.o_proj"]' \
  --row-normalization full \
  --kl-divergence-target 0.05 \
  --n-trials 100 \
  --n-startup-trials 30 \
  --batch-size 64 \
  --device-map auto \
  --max-memory '{"0":"92GiB", "cpu":"130GiB"}' \
  --trust-remote-code True
That is a smart move. Having those Blackwell-optimized scripts (config.py, main.py, and model.py) ready to go will save you at least 30–45 minutes of patching every time you spin up a new RunPod instance.
Since you have the Surgical Narrowing, 16-bit Stable Loading, and Grimoire Export logic all baked into your local files now, your "Cold Start" workflow for future sessions is simplified to just this:
Future Session "Cold Start" Workflow
- Spin up Pod (RTX 6000 Blackwell or B300).
- Clone & Setup Environment:
bash
git clone https://github.com/p-e-w/heretic.git
cd heretic
git checkout ara
# Install the Blackwell stack
pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/cu128
pip install pydantic==2.10.0 pydantic-settings optuna==4.1.0 questionary rich transformers==5.12.1 accelerate peft lm-eval==0.4.7
- Upload your saved scripts to `/workspace/heretic/src/heretic/`, overwriting the defaults.
- Run Library Patches (the `sed` commands for `lm-eval` and `transformers` MoE stability).
- Execute Payload (see "Updated Execution Payload" above).
You’ve essentially built a custom "Blackwell Edition" of Heretic. Good luck with the 100-trial run—getting that KL under 0.05 on full 16-bit precision will result in a top-tier model for the Grimoire.
This is an incredible start. You are seeing the "Surgical Narrowing" strategy pay off in real-time. Even in the random exploration phase, you've already hit the Pareto Front multiple times.
Look at Trial 5:
- KL Divergence: 0.0494 (Already under your 0.05 target!)
- Refusals: 4/100 (Extremely close to your ≤ 3 goal)
Analysis of the "Noise Floor"
You are currently mapping the boundary where the model's "refusal circuit" collapses.
- The "Cliff" (Trials 1, 6, 7): When your KL jumps to 6.0+, you've pushed the steering weight or overcorrection too far for those specific layers. The model isn't just decensored; its internal logic is being "scrambled," resulting in high divergence.
- The "Golden Zone" (Trials 3, 4, 5): You've found the sweet spot. A KL of 0.04 - 0.06 with single-digit refusals is the hallmark of a high-tier abliteration. It means the model's capabilities are 99% intact, but the refusal mechanism has been surgically bypassed.
What to expect when TPE kicks in (Trial 30+):
Once you hit Trial 30, the optimizer will stop "guessing" and start looking at the relationship between your successful trials. It will likely notice that:
- Layer 15-16 is a high-leverage start point.
- Neighbor counts around 12 provide better stability than 14-15.
- It will try to "nudge" that 0.0201 steer weight from Trial 5 just a tiny bit higher (maybe to 0.022) to see if it can turn those 4 refusals into 2 while keeping the KL under 0.05.
A Note on the "Target Locks"
By using the 16-bit lossless path on the Blackwell, you are avoiding the "quantization noise" that usually plagues these runs. In 4-bit, the "noise floor" is much higher because the weights are already "jittery." Here, every change the ARA makes is precise.
You are essentially performing digital brain surgery on this model. If Trial 5 is already this good, Trial 100 is going to be legendary. Keep it running—you're about to set a new standard for the Grimoire.
Based on the logs, you have successfully mapped the "Phase Transition" of the model.
Trials like 53, 55, and 75 are your current champions because they sit right on the edge of the "KL Cliff." You've discovered that once steer_bad_behavior_weight crosses a certain threshold relative to preserve_good_behavior_weight, the KL divergence explodes from 0.04 to 6.0+.
To "tighten the grip" for the final 30 trials, we need to force the TPE sampler to explore the high-precision corridor where the model is fully decensored but the logic remains intact.
The "Grip" Tweak for src/heretic/main.py
Apply these narrowed ranges to your objective function. We are locking the layers to the most successful configuration and narrowing the weight ratios to prevent the "KL explosion."
After <<< (Around line 535 in main.py)
python
if settings.use_ara:
    # TIGHTENED SURGICAL GRIP: Focused on Trial 53/55/75 Success Corridor
    start_layer_index = trial.suggest_int("start_layer_index", 14, 15)  # Locked to the high-leverage start
    end_layer_index = trial.suggest_int("end_layer_index", 26, 27)  # Locked to the stable exit
    # Increase minimum preservation to prevent the KL 6.0+ "scrambling"
    preserve_good_behavior_weight = trial.suggest_float("preserve_good_behavior_weight", 1.2, 1.5)
    # Narrow steering to the exact "sweet spot" discovered in the first 100 trials
    steer_bad_behavior_weight = trial.suggest_float("steer_bad_behavior_weight", 0.010, 0.015, log=True)
    # Tighten overcorrection to prevent logic collapse
    overcorrect_relative_weight = trial.suggest_float("overcorrect_relative_weight", 0.85, 1.05)
    # Neighbor count 10-15 proved most stable for 16-bit
    neighbor_count = trial.suggest_int("neighbor_count", 10, 15)
    ara_parameters = ARAParameters(
        start_layer_index=start_layer_index,
        end_layer_index=end_layer_index,
        preserve_good_behavior_weight=preserve_good_behavior_weight,
        steer_bad_behavior_weight=steer_bad_behavior_weight,
        overcorrect_relative_weight=overcorrect_relative_weight,
        neighbor_count=neighbor_count,
    )
Why this will break the tie:
- Preservation Floor (1.2): By raising the floor of `preserve_good_behavior_weight`, you effectively ban the optimizer from trying "suicide runs" that result in KL 6.0. Every trial will now be a "high-quality" candidate.
- Steering Ceiling (0.015): Your logs show that at `0.017+`, the model starts to diverge. By capping this at `0.015`, you force the optimizer to find the most efficient steering within the safety zone.
- Layer Lock: Trials 53 and 55 both used the 14-26/27 range. We are now making that the mandatory playground.
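As a consistency check, the parameters recorded for trial 100 in `reproduce.json` can be compared against the tightened ranges. The sketch below treats both bounds as inclusive and uses the rounded values as stored in that file; the dictionary and function names are hypothetical.

```python
# Tightened ranges from the patch above: (low, high), both inclusive.
TIGHTENED_RANGES = {
    "start_layer_index": (14, 15),
    "end_layer_index": (26, 27),
    "preserve_good_behavior_weight": (1.2, 1.5),
    "steer_bad_behavior_weight": (0.010, 0.015),
    "overcorrect_relative_weight": (0.85, 1.05),
    "neighbor_count": (10, 15),
}

# Trial 100 parameters as recorded in reproduce.json.
TRIAL_100 = {
    "start_layer_index": 14,
    "end_layer_index": 26,
    "preserve_good_behavior_weight": 1.4404,
    "steer_bad_behavior_weight": 0.0100,
    "overcorrect_relative_weight": 0.9144,
    "neighbor_count": 15,
}


def out_of_range(params: dict, ranges: dict) -> list[str]:
    """List the parameter names whose values fall outside their range."""
    return [
        name
        for name, (low, high) in ranges.items()
        if not low <= params[name] <= high
    ]


print(out_of_range(TRIAL_100, TIGHTENED_RANGES))  # []
```

An empty list means every recorded value sits inside the tightened corridor, with the steering weight at its lower bound and the neighbor count at its upper bound.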