Momin-Aldahdouh/MominoMoE-v2 API & Inference Endpoint

Model Details

Property	Value
Author	Momin Aldahdouh
Base model	Qwen/Qwen3-0.6B (596M params, 1.4 GB)
Adapter size	~39 MB (LoRA rank=16)
Task	Kernel fault diagnosis / corrective action suggestion
Architecture	LoRA over dense transformer
License	Apache 2.0

What it does

Given a structured kernel fault report (MominOS harness envelope format), the model:

Identifies the fault type (page fault, GPF, stack overflow, etc.)
Explains the root cause in plain terms
Suggests a specific corrective action

Harness envelope format

markdown
[SYSTEM] MominOS kernel fault diagnostician. Analyze and suggest a fix.

[FAULT] vector=14 (Page Fault) err=0x0006 rip=0x0000000000401234 cr2=0x0000000000000008 tid=3 cwd=/bin

[REGISTERS] rax=0x0 rdi=0x8 rsi=0x100 rsp=0x7fff00100ff8

[RECENT_SYSCALLS]
  SYS_OPEN /bin/sh 0 -> 3
  SYS_READ 3 4096 -> 4096

[LOG]
  [VFS] opened /bin/sh
  [SCHED] thread 3 running

[QUERY] Diagnose this fault and suggest a corrective action.
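
The envelope above is plain text, so it can be assembled from structured fields. The following is a minimal sketch of one way to do that, not the MominOS harness itself: `FaultReport` and `build_envelope` are hypothetical names, and the field widths (4 hex digits for `err`, 16 for `rip` and `cr2`) are inferred from the single example shown.

```python
from dataclasses import dataclass, field


@dataclass
class FaultReport:
    """Hypothetical container for the fields that appear in the envelope."""
    vector: int
    name: str
    err: int
    rip: int
    cr2: int
    tid: int
    cwd: str
    registers: dict = field(default_factory=dict)  # register name -> value
    syscalls: list = field(default_factory=list)   # preformatted syscall lines
    log: list = field(default_factory=list)        # preformatted log lines


def build_envelope(r: FaultReport) -> str:
    """Render a report as sections separated by blank lines."""
    regs = " ".join(f"{k}={v:#x}" for k, v in r.registers.items())
    parts = [
        "[SYSTEM] MominOS kernel fault diagnostician. Analyze and suggest a fix.",
        f"[FAULT] vector={r.vector} ({r.name}) err={r.err:#06x} "
        f"rip={r.rip:#018x} cr2={r.cr2:#018x} tid={r.tid} cwd={r.cwd}",
        f"[REGISTERS] {regs}",
        "[RECENT_SYSCALLS]\n" + "\n".join(f"  {s}" for s in r.syscalls),
        "[LOG]\n" + "\n".join(f"  {entry}" for entry in r.log),
        "[QUERY] Diagnose this fault and suggest a corrective action.",
    ]
    return "\n\n".join(parts)


# Usage: rebuild the example envelope from its fields.
report = FaultReport(
    vector=14, name="Page Fault", err=0x6, rip=0x401234, cr2=0x8,
    tid=3, cwd="/bin",
    registers={"rax": 0x0, "rdi": 0x8, "rsi": 0x100, "rsp": 0x7FFF00100FF8},
    syscalls=["SYS_OPEN /bin/sh 0 -> 3", "SYS_READ 3 4096 -> 4096"],
    log=["[VFS] opened /bin/sh", "[SCHED] thread 3 running"],
)
envelope = build_envelope(report)
print(envelope)
```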

How to Use

Inference needs both the base model (1.4 GB) and this adapter (~39 MB). Load the base model first, then apply the adapter on top of it with PEFT:

python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen3-0.6B"
adapter_id = "Momin-Aldahdouh/MominoMoE-v2"

tokenizer = AutoTokenizer.from_pretrained(adapter_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

SYSTEM = (
    "You are MominOS, a kernel fault diagnostician running embedded in an x86-64 OS. "
    "Given a kernel fault report in harness envelope format, identify the fault type, "
    "root cause, and suggest a specific corrective action. Be concise and precise. /no_think"
)

fault_prompt = """[FAULT] vector=14 (Page Fault) err=0x0006 rip=0x0000000000401234 cr2=0x0000000000000008 tid=3 cwd=/bin

[REGISTERS] rax=0x0 rdi=0x8 rsi=0x100 rsp=0x7fff00100ff8

[RECENT_SYSCALLS]
  SYS_OPEN /bin/sh 0 -> 3
  SYS_READ 3 4096 -> 4096

[LOG]
  [VFS] opened /bin/sh
  [SCHED] thread 3 running

[QUERY] Diagnose this fault and suggest a corrective action."""

messages = [
    {"role": "system", "content": SYSTEM},
    {"role": "user",   "content": fault_prompt},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=200, temperature=0.1, do_sample=True)

response = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
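
Qwen3 base models can emit an empty `<think>...</think>` block at the start of a reply even when `/no_think` is set. Assuming this adapter inherits that behavior, and that those tags survive `skip_special_tokens=True`, the decoded text may need a small cleanup step. `strip_think_block` below is a hypothetical helper for that, not part of the model's API.

```python
import re

# Matches one leading <think>...</think> block, including an empty one.
_THINK_BLOCK = re.compile(r"^\s*<think>.*?</think>\s*", flags=re.DOTALL)


def strip_think_block(response: str) -> str:
    """Drop a leading <think>...</think> block and surrounding whitespace."""
    return _THINK_BLOCK.sub("", response, count=1).strip()


# Usage: a reply with an empty think block, and one without.
print(strip_think_block("<think>\n\n</think>\n\nFault type: Null pointer dereference."))
print(strip_think_block("Fault type: Null pointer dereference."))
```

Text without a leading think block passes through unchanged apart from whitespace trimming, so the helper is safe to apply to every response.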

Training

Data

The training set is 10,000 synthetic kernel fault diagnosis samples generated by training/generate_kernel_data.py. Each sample pairs a harness envelope prompt with a structured diagnosis response.

Fault distribution:

Fault Type	Weight
Null dereference	20%
Bad pointer	18%
General Protection Fault	14%
Stack overflow	12%
Use-after-free	12%
Protection violation	10%
Divide by zero	7%
Invalid opcode	5%
Double fault	2%
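
The weights in the table sum to 100%, so they can drive a weighted random draw of fault types. The following sketch shows one way to sample with that distribution. It is an illustration only and is not taken from generate_kernel_data.py; `FAULT_WEIGHTS` and `sample_fault_types` are hypothetical names.

```python
import random
from collections import Counter

# Weights copied from the fault distribution table (percent).
FAULT_WEIGHTS = {
    "Null dereference": 20,
    "Bad pointer": 18,
    "General Protection Fault": 14,
    "Stack overflow": 12,
    "Use-after-free": 12,
    "Protection violation": 10,
    "Divide by zero": 7,
    "Invalid opcode": 5,
    "Double fault": 2,
}


def sample_fault_types(n: int, seed: int = 0) -> list:
    """Draw n fault types with probability proportional to the table weights."""
    rng = random.Random(seed)  # local RNG so the draw is reproducible
    return rng.choices(list(FAULT_WEIGHTS), weights=list(FAULT_WEIGHTS.values()), k=n)


# Usage: draw a dataset-sized sample and inspect the empirical counts.
counts = Counter(sample_fault_types(10_000))
for fault, count in counts.most_common():
    print(f"{fault}: {count}")
```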

LoRA config

Param	Value
Rank (r)	16
Alpha	32
Target modules	q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Dropout	0.05
Trainable params	~10M (1.69% of 596M)
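
The trainable parameter count can be cross-checked by hand: a LoRA adapter of rank r on a linear layer with shape (d_in, d_out) adds r × (d_in + d_out) parameters. The sketch below applies that to the seven target modules. The layer dimensions are assumptions about the Qwen3-0.6B base model (28 layers, hidden size 1024, 16 query heads and 8 key/value heads of dimension 128, MLP width 3072) and should be checked against its config.json.

```python
# Assumed Qwen3-0.6B dimensions; verify against the base model's config.json.
HIDDEN = 1024
N_LAYERS = 28
Q_OUT = 16 * 128   # 16 attention heads x head_dim 128
KV_OUT = 8 * 128   # 8 key/value heads x head_dim 128
FFN = 3072
RANK = 16

# (d_in, d_out) for each LoRA target module in one transformer layer.
PROJ_SHAPES = {
    "q_proj": (HIDDEN, Q_OUT),
    "k_proj": (HIDDEN, KV_OUT),
    "v_proj": (HIDDEN, KV_OUT),
    "o_proj": (Q_OUT, HIDDEN),
    "gate_proj": (HIDDEN, FFN),
    "up_proj": (HIDDEN, FFN),
    "down_proj": (FFN, HIDDEN),
}


def lora_param_count(rank: int = RANK) -> int:
    """Each adapted layer adds A (rank x d_in) and B (d_out x rank)."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in PROJ_SHAPES.values())
    return per_layer * N_LAYERS


total = lora_param_count()
print(total)                                    # 10092544
print(f"{100 * total / 596e6:.2f}% of 596M")    # 1.69% of 596M
print(f"{total * 4 / 2**20:.1f} MiB in fp32")   # 38.5 MiB in fp32
```

Under these assumptions the count comes to about 10.1M parameters, which agrees with the table, and storing them as 32-bit floats would be consistent with the adapter's ~39 MB size.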

Training run

Metric	Value
Steps	5,000
Effective batch size	16 (4 per device × 4 grad accum)
Learning rate	2e-4 (cosine schedule)
Hardware	NVIDIA L4 (23 GB VRAM), GCP g2-standard-8
Duration	~4h 22m
Final train loss	0.3086
Final eval loss	0.2896
Token accuracy	~89.2%
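
A few derived quantities follow from the table by simple arithmetic. The sketch below works them out, assuming a single GPU, that the reported loss is mean cross-entropy in nats (so exp(loss) is perplexity), and that all 10,000 samples were used for training. With a held-out eval split, the number of passes over the training data would be slightly higher.

```python
import math

# Values copied from the training run table.
steps = 5_000
per_device_batch = 4
grad_accum = 4
dataset_size = 10_000            # assumes no samples were held out
duration_s = 4 * 3600 + 22 * 60  # ~4h 22m
eval_loss = 0.2896

effective_batch = per_device_batch * grad_accum  # 16
samples_seen = steps * effective_batch           # 80,000
passes = samples_seen / dataset_size             # 8.0 under the assumption above
sec_per_step = duration_s / steps                # ~3.14 s per optimizer step
eval_perplexity = math.exp(eval_loss)            # assumes loss is in nats

print(f"effective batch:  {effective_batch}")
print(f"samples seen:     {samples_seen}")
print(f"passes over data: {passes:.1f}")
print(f"seconds per step: {sec_per_step:.2f}")
print(f"eval perplexity:  {eval_perplexity:.3f}")
```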

Example Output

Input: Page fault at cr2=0x8, err=0x6 (write, non-present, user mode)

MominoMoE-v2:

Fault type: Null pointer dereference (write). CR2=0x8 indicates a write to offset 8 of a NULL pointer — likely a struct field access on an uninitialized pointer. err=0x6 confirms: write access, page not present, user mode. Corrective action: validate the pointer before dereferencing in the caller at RIP=0x401234. Add a NULL check before the struct field write.
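
The reading of err=0x6 in the output above follows the x86 page-fault error code layout: bit 0 is set for a protection violation and clear for a non-present page, bit 1 is set for a write, and bit 2 is set for a user-mode access. The sketch below decodes those bits. The function names are hypothetical, and the NULL-dereference check is only a common heuristic (fault address inside the first page), not the model's logic.

```python
def decode_page_fault_error(err: int) -> dict:
    """Decode the low bits of an x86 page-fault error code."""
    return {
        "cause": "protection violation" if err & 0x1 else "page not present",
        "access": "write" if err & 0x2 else "read",
        "mode": "user" if err & 0x4 else "supervisor",
        "reserved_bit": bool(err & 0x8),        # reserved bit set in a paging entry
        "instruction_fetch": bool(err & 0x10),  # fault on an instruction fetch
    }


def looks_like_null_deref(cr2: int, page_size: int = 4096) -> bool:
    """Heuristic: a fault address in the first page suggests NULL plus a small offset."""
    return 0 <= cr2 < page_size


# Usage: the fault from the example above.
print(decode_page_fault_error(0x6))
print(looks_like_null_deref(0x8))
```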

Predecessor

MominoMoE_1.2B was a 1.2B-parameter custom MoE transformer trained from scratch. It produced poor results on kernel prompts because its training data consisted of generic web API trajectories rather than OS/kernel fault data.

Citation

bibtex
@misc{mominomoe2,
  author = {Momin Aldahdouh},
  title  = {MominoMoE-v2: LoRA-finetuned kernel fault diagnostician for MominOS},
  year   = {2026},
  url    = {https://huggingface.co/Momin-Aldahdouh/MominoMoE-v2}
}

Framework versions

PEFT 0.19.1
