README

License: apache-2.0

Training

  • Base model: zai-org/GLM-4.5-Air (106 B params, MoE), loaded in 4-bit NF4 via bitsandbytes
  • PEFT: LoRA, r=16, α=32, dropout=0, attention-only target modules (q_proj, k_proj, v_proj, o_proj). MLP/expert modules are excluded because GLM's MoE expert weights produce huge ParamWrapper delta tensors at runtime
  • Optimizer: 8-bit AdamW (bnb.optim.AdamW8bit)
  • Attention: SDPA (PyTorch scaled_dot_product_attention, which can dispatch to a FlashAttention kernel); eager attention OOMs at this size
  • Steps: 1500 global steps, effective batch size 16 (per-rank 2 × grad-accum 8), sequence length capped at 1024
  • Layers hooked: 25 %, 50 %, 75 % of depth
  • Data: paper-spec mixture — latentqa + classification (geometry_of_truth, relations, language_identification, sst2, etc.) + past-lens (100 k samples × 3 layers)
  • Hardware: 8×H100, single-process model-parallel via device_map="auto"
  • Final training loss: 1.71
  • Wall-clock cost: about $60 in compute (75 min on 8×H100 at roughly $24/hr × 8 GPUs)
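The batch and depth settings in the list above can be turned into concrete numbers with a little arithmetic. The following is a minimal sketch, not the training script: the helper names are hypothetical, the floor-rounding of depth fractions is an assumption, and the 48-layer count is only an illustrative value (read the real one from `model.config.num_hidden_layers`).

```python
def effective_batch_size(per_rank: int, grad_accum: int) -> int:
    """Sequences contributing to one optimizer step."""
    return per_rank * grad_accum


def hooked_layer_indices(num_layers: int, percents=(25, 50, 75)) -> list[int]:
    """Map depth percentages to layer indices (floor rounding is an assumption)."""
    return [num_layers * p // 100 for p in percents]


STEPS = 1500        # global steps, from the list above
SEQ_LEN_CAP = 1024  # maximum sequence length, from the list above

batch = effective_batch_size(per_rank=2, grad_accum=8)
sequences_seen = STEPS * batch
# Upper bound only: sequences shorter than the cap contribute fewer tokens.
max_tokens_seen = sequences_seen * SEQ_LEN_CAP

EXAMPLE_NUM_LAYERS = 48  # hypothetical depth, for illustration only
example_layers = hooked_layer_indices(EXAMPLE_NUM_LAYERS)

print(batch, sequences_seen, max_tokens_seen, example_layers)
```

Under these settings the run sees 24,000 sequences, so at most about 24.6 M tokens.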

How to use

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization via bitsandbytes
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    llm_int8_enable_fp32_cpu_offload=True,  # permit CPU offload of modules that do not fit on GPU
)

# Load the base model, sharded across the available devices
model = AutoModelForCausalLM.from_pretrained(
    "zai-org/GLM-4.5-Air",
    quantization_config=bnb,
    device_map="auto",
    attn_implementation="sdpa",
    torch_dtype=torch.bfloat16,
)

# Attach the LoRA adapter (requires the peft package)
model.load_adapter("<your-username>/glm-4.5-air-activation-oracle", adapter_name="ao")
tokenizer = AutoTokenizer.from_pretrained("zai-org/GLM-4.5-Air")
```

Next, build a prompt in the paper's format, with <TOK> placeholders marking the positions where residual-stream activations will be injected. Then hook the chosen layer so that it overwrites those positions with externally collected activations before generating. Full pipeline: activation_oracles.
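The overwrite step can be illustrated with a PyTorch forward hook. This is a minimal sketch under stated assumptions, not the activation_oracles pipeline: `make_injection_hook` is a hypothetical helper, a toy `nn.Linear` stands in for a decoder layer, and the placeholder positions and activation vectors are made up for the demonstration.

```python
import torch
from torch import nn


def make_injection_hook(positions: torch.Tensor, vectors: torch.Tensor):
    """Build a forward hook that overwrites hidden states at `positions` (batch index 0)."""

    def hook(module, inputs, output):
        # Decoder layers often return a tuple whose first element is the hidden states.
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden.clone()
        idx = positions.to(hidden.device)
        hidden[0, idx] = vectors.to(device=hidden.device, dtype=hidden.dtype)
        if isinstance(output, tuple):
            return (hidden,) + tuple(output[1:])
        return hidden  # a non-None return value replaces the module's output

    return hook


torch.manual_seed(0)
toy_layer = nn.Linear(8, 8)                  # stand-in for the hooked layer
prompt_states = torch.randn(1, 5, 8)         # (batch, sequence, hidden)
placeholder_positions = torch.tensor([1, 3]) # assumed <TOK> positions
collected = torch.ones(2, 8)                 # assumed externally collected activations

with torch.no_grad():
    clean = toy_layer(prompt_states)
    handle = toy_layer.register_forward_hook(
        make_injection_hook(placeholder_positions, collected)
    )
    try:
        patched = toy_layer(prompt_states)
    finally:
        handle.remove()  # always detach the hook so later calls run unmodified
```

With a real model, the hook would be registered on the chosen decoder layer and left in place for the `generate` call. Removing it in a `finally` block keeps a failed generation from leaving the model patched.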

Evaluation

BFI-44 personality probe, helpful-baseline system prompt, layer 50 %:

| Trait | AO read | Plaintext | Δ |
|---|---|---|---|
| Openness | 0.26 | 0.58 | −0.32 |
| Conscientiousness | 0.46 | 0.89 | −0.43 |
| Extraversion | 0.40 | 0.46 | −0.07 |
| Agreeableness | 0.46 | 0.81 | −0.35 |
| Neuroticism | 0.41 | 0.20 | +0.21 |

This matches the pattern reported in the original 8-model panel: AO reads consistently lower than plaintext on positively-valenced traits and higher on Neuroticism, suggesting the helpful-assistant alignment suppresses anxiety-adjacent self-report.
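The Δ column reads as AO read minus plaintext, which can be checked directly from the rounded scores above. A small sketch follows; the variable names are illustrative, and recomputing from two-decimal values is only an approximation of whatever unrounded scores were used.

```python
# (AO read, plaintext, reported delta), copied from the evaluation table
scores = {
    "Openness": (0.26, 0.58, -0.32),
    "Conscientiousness": (0.46, 0.89, -0.43),
    "Extraversion": (0.40, 0.46, -0.07),
    "Agreeableness": (0.46, 0.81, -0.35),
    "Neuroticism": (0.41, 0.20, 0.21),
}

# Recompute delta = AO read - plaintext from the rounded values
recomputed = {trait: round(ao - pt, 2) for trait, (ao, pt, _) in scores.items()}

# Traits where the recomputed delta differs from the reported one
mismatches = {
    trait: (recomputed[trait], reported)
    for trait, (_, _, reported) in scores.items()
    if abs(recomputed[trait] - reported) > 1e-9
}

# Traits where the AO read is lower than the plaintext self-report
lower_in_ao = [trait for trait, d in recomputed.items() if d < 0]

print(recomputed)
print(mismatches)
print(lower_in_ao)
```

Only Extraversion differs (−0.06 recomputed versus −0.07 reported), presumably because the reported Δ was computed before rounding. The sign pattern is unchanged: every trait except Neuroticism reads lower through the AO.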

Citation

Karvonen, A. et al. "Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers." arXiv:2512.15674 (2025).

Model provider: swan-0

Model tree: base zai-org/GLM-4.5-Air, adapter: this model

Modalities: text input, text output
