
License: apache-2.0

Model Details

| Property | Value |
|---|---|
| Author | Momin Aldahdouh |
| Base model | Qwen/Qwen3-0.6B (596M params, 1.4 GB) |
| Adapter size | ~39 MB (LoRA rank=16) |
| Task | Kernel fault diagnosis / corrective action suggestion |
| Architecture | LoRA over dense transformer |
| License | Apache 2.0 |

What it does

Given a structured kernel fault report (MominOS harness envelope format), the model:

  1. Identifies the fault type (page fault, GPF, stack overflow, etc.)
  2. Explains the root cause in plain terms
  3. Suggests a specific corrective action

Harness envelope format

markdown

[SYSTEM] MominOS kernel fault diagnostician. Analyze and suggest a fix.
[FAULT] vector=14 (Page Fault) err=0x0006 rip=0x0000000000401234 cr2=0x0000000000000008 tid=3 cwd=/bin
[REGISTERS] rax=0x0 rdi=0x8 rsi=0x100 rsp=0x7fff00100ff8
[RECENT_SYSCALLS]
SYS_OPEN /bin/sh 0 -> 3
SYS_READ 3 4096 -> 4096
[LOG]
[VFS] opened /bin/sh
[SCHED] thread 3 running
[QUERY] Diagnose this fault and suggest a corrective action.
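
The envelope above is line-oriented, so it can be split back into sections with a few lines of Python. The following is a minimal sketch, not part of the MominOS harness: `parse_envelope`, `parse_fields`, and the `SECTION_TAGS` set are hypothetical names, and the set of section tags is inferred from the single example shown here.

```python
import re

# Section tags inferred from the example envelope (an assumption). Other
# bracketed prefixes such as [VFS] or [SCHED] are treated as log content.
SECTION_TAGS = {"SYSTEM", "FAULT", "REGISTERS", "RECENT_SYSCALLS", "LOG", "QUERY"}
TAG_RE = re.compile(r"^\[([A-Z_]+)\]\s*(.*)$")

EXAMPLE = """[SYSTEM] MominOS kernel fault diagnostician. Analyze and suggest a fix.
[FAULT] vector=14 (Page Fault) err=0x0006 rip=0x0000000000401234 cr2=0x0000000000000008 tid=3 cwd=/bin
[REGISTERS] rax=0x0 rdi=0x8 rsi=0x100 rsp=0x7fff00100ff8
[RECENT_SYSCALLS]
SYS_OPEN /bin/sh 0 -> 3
SYS_READ 3 4096 -> 4096
[LOG]
[VFS] opened /bin/sh
[SCHED] thread 3 running
[QUERY] Diagnose this fault and suggest a corrective action."""


def parse_envelope(text):
    """Return {section_tag: [lines]} for one harness envelope."""
    sections = {}
    current = None
    for line in text.splitlines():
        match = TAG_RE.match(line)
        if match and match.group(1) in SECTION_TAGS:
            current = match.group(1)
            sections[current] = [match.group(2)] if match.group(2) else []
        elif current is not None and line.strip():
            sections[current].append(line.strip())
    return sections


def parse_fields(line):
    """Parse key=value pairs; hex and decimal values become ints."""
    fields = {}
    for key, value in re.findall(r"(\w+)=(\S+)", line):
        if value.startswith("0x"):
            fields[key] = int(value, 16)
        elif value.isdigit():
            fields[key] = int(value)
        else:
            fields[key] = value
    return fields


sections = parse_envelope(EXAMPLE)
fault = parse_fields(sections["FAULT"][0])
registers = parse_fields(sections["REGISTERS"][0])
print(fault["vector"], hex(fault["cr2"]), registers["rdi"])
```

Keeping an explicit tag set is a deliberate choice: log lines in the example carry their own bracketed subsystem prefixes, so matching any bracketed word would split the `[LOG]` section incorrectly.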

How to Use

You need both the base model (1.4 GB) and this adapter (39 MB):

python

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen3-0.6B"
adapter_id = "Momin-Aldahdouh/MominoMoE-v2"

# Load the tokenizer from the adapter repo, then the base model with the LoRA adapter on top.
tokenizer = AutoTokenizer.from_pretrained(adapter_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

# The trailing "/no_think" is Qwen3's soft switch for turning off the thinking block.
SYSTEM = (
    "You are MominOS, a kernel fault diagnostician running embedded in an x86-64 OS. "
    "Given a kernel fault report in harness envelope format, identify the fault type, "
    "root cause, and suggest a specific corrective action. Be concise and precise. /no_think"
)

fault_prompt = """[FAULT] vector=14 (Page Fault) err=0x0006 rip=0x0000000000401234 cr2=0x0000000000000008 tid=3 cwd=/bin
[REGISTERS] rax=0x0 rdi=0x8 rsi=0x100 rsp=0x7fff00100ff8
[RECENT_SYSCALLS]
SYS_OPEN /bin/sh 0 -> 3
SYS_READ 3 4096 -> 4096
[LOG]
[VFS] opened /bin/sh
[SCHED] thread 3 running
[QUERY] Diagnose this fault and suggest a corrective action."""

messages = [
    {"role": "system", "content": SYSTEM},
    {"role": "user", "content": fault_prompt},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=200, temperature=0.1, do_sample=True)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)

Training

Data

The training set consists of 10,000 synthetic kernel fault diagnosis samples generated by training/generate_kernel_data.py. Each sample pairs a harness envelope prompt with a structured diagnosis response.

Fault distribution:

| Fault Type | Weight |
|---|---|
| Null dereference | 20% |
| Bad pointer | 18% |
| General Protection Fault | 14% |
| Stack overflow | 12% |
| Use-after-free | 12% |
| Protection violation | 10% |
| Divide by zero | 7% |
| Invalid opcode | 5% |
| Double fault | 2% |
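
A distribution like this can be reproduced with weighted sampling. The snippet below is only an illustration of that idea using the weights from the table; it is not taken from `training/generate_kernel_data.py`, whose internals are not shown here, and `sample_fault_types` is a hypothetical name.

```python
import random
from collections import Counter

# Weights copied from the fault distribution table (percent).
FAULT_WEIGHTS = {
    "Null dereference": 20,
    "Bad pointer": 18,
    "General Protection Fault": 14,
    "Stack overflow": 12,
    "Use-after-free": 12,
    "Protection violation": 10,
    "Divide by zero": 7,
    "Invalid opcode": 5,
    "Double fault": 2,
}


def sample_fault_types(n, seed=0):
    """Draw n fault types with probability proportional to their weight."""
    rng = random.Random(seed)  # seeded so the draw is repeatable
    return rng.choices(list(FAULT_WEIGHTS), weights=list(FAULT_WEIGHTS.values()), k=n)


samples = sample_fault_types(10_000)
for fault_type, count in Counter(samples).most_common():
    print(f"{fault_type:<26} {count / len(samples):.1%}")
```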

LoRA config

| Param | Value |
|---|---|
| Rank (r) | 16 |
| Alpha | 32 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Dropout | 0.05 |
| Trainable params | ~10M (1.69% of 596M) |
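
The trainable parameter count can be sanity-checked by hand: a LoRA adapter of rank r on a linear layer with input size d_in and output size d_out adds r × (d_in + d_out) parameters. The sketch below applies that formula to the seven target modules. The layer dimensions are assumptions recalled from the Qwen3-0.6B configuration (hidden size 1024, 28 layers, 16 query heads and 8 key/value heads of dimension 128, MLP size 3072) and should be checked against the base model's `config.json`.

```python
# Assumed Qwen3-0.6B dimensions (verify against the base model's config.json).
HIDDEN = 1024
LAYERS = 28
Q_OUT = 16 * 128       # query heads * head dim
KV_OUT = 8 * 128       # key/value heads * head dim
INTERMEDIATE = 3072
RANK = 16

# (d_in, d_out) for each LoRA target module in one transformer layer.
SHAPES = {
    "q_proj": (HIDDEN, Q_OUT),
    "k_proj": (HIDDEN, KV_OUT),
    "v_proj": (HIDDEN, KV_OUT),
    "o_proj": (Q_OUT, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_param_count(rank, shapes, layers):
    """Each adapter holds A (rank x d_in) and B (d_out x rank)."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * layers


trainable = lora_param_count(RANK, SHAPES, LAYERS)
print(f"{trainable:,} trainable params")
print(f"{100 * trainable / 596e6:.2f}% of 596M")
print(f"{trainable * 4 / 1e6:.1f} MB if stored as float32")
```

Under these assumptions the count comes to 10,092,544 parameters, which matches the ~10M and 1.69% figures in the table and is in the same range as the ~39 MB adapter size.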

Training run

| Metric | Value |
|---|---|
| Steps | 5,000 |
| Effective batch size | 16 (4 per device × 4 grad accum) |
| Learning rate | 2e-4 (cosine schedule) |
| Hardware | NVIDIA L4 (23 GB VRAM), GCP g2-standard-8 |
| Duration | ~4h 22m |
| Final train loss | 0.3086 |
| Final eval loss | 0.2896 |
| Token accuracy | ~89.2% |
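
The batch size and learning-rate entries can be unpacked with a little arithmetic. The following is a rough sketch under stated assumptions: a single GPU, a cosine decay from the peak rate to zero with no warmup (the README does not say whether warmup was used), and `cosine_lr` as a hypothetical helper rather than the scheduler actually used in training.

```python
import math

PER_DEVICE_BATCH = 4
GRAD_ACCUM = 4
STEPS = 5_000
PEAK_LR = 2e-4
DATASET_SIZE = 10_000

# One optimizer step consumes per-device batch * accumulation steps sequences.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM
sequences_seen = effective_batch * STEPS
# Upper bound on passes over the data; lower if part of it was held out for eval.
passes = sequences_seen / DATASET_SIZE


def cosine_lr(step, total_steps=STEPS, peak=PEAK_LR):
    """Cosine decay from peak to 0 over total_steps (no warmup assumed)."""
    progress = min(step, total_steps) / total_steps
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch, sequences_seen, passes)
for step in (0, 1250, 2500, 3750, 5000):
    print(step, f"{cosine_lr(step):.2e}")
```

With these numbers the run processes 80,000 sequences, or about eight passes over the 10,000 samples if all of them were used for training.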

Example Output

Input: Page fault at cr2=0x8, err=0x6 (write, non-present, user mode)

MominoMoE-v2:

Fault type: Null pointer dereference (write). CR2=0x8 indicates a write to offset 8 of a NULL pointer — likely a struct field access on an uninitialized pointer. err=0x6 confirms: write access, page not present, user mode. Corrective action: validate the pointer before dereferencing in the caller at RIP=0x401234. Add a NULL check before the struct field write.
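
The reading of `err=0x6` in this example follows the x86 page-fault error code layout, where bit 0 is the present flag, bit 1 the write flag, and bit 2 the user-mode flag. A minimal decoder is sketched below for checking such reports by hand. The function names are hypothetical, and the "CR2 in the first page" test is a common heuristic for NULL-based accesses, used here as an assumption rather than something the model is documented to do.

```python
def decode_page_fault_error(err):
    """Decode the low bits of an x86 page-fault error code."""
    return {
        "present": bool(err & 0x01),            # 0 = page not present, 1 = protection violation
        "write": bool(err & 0x02),              # 0 = read access, 1 = write access
        "user": bool(err & 0x04),               # 0 = supervisor mode, 1 = user mode
        "reserved_bit": bool(err & 0x08),       # reserved bit set in a paging structure
        "instruction_fetch": bool(err & 0x10),  # fault caused by an instruction fetch
    }


def looks_like_null_deref(cr2, page_size=0x1000):
    """Heuristic (assumption): a faulting address in the first page suggests NULL + offset."""
    return cr2 < page_size


flags = decode_page_fault_error(0x6)
print(flags)
print(looks_like_null_deref(0x8))
```

For `err=0x6` the decoder reports a write to a non-present page from user mode, and `cr2=0x8` falls within the first page, consistent with the diagnosis above.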


Predecessor

MominoMoE_1.2B was a custom 1.2B-parameter MoE transformer trained from scratch. It produced poor results on kernel prompts because its training data consisted of generic web API trajectories rather than OS/kernel fault data.


Citation

bibtex

@misc{mominomoe2,
  author = {Momin Aldahdouh},
  title  = {MominoMoE-v2: LoRA-finetuned kernel fault diagnostician for MominOS},
  year   = {2026},
  url    = {https://huggingface.co/Momin-Aldahdouh/MominoMoE-v2}
}

Framework versions

  • PEFT 0.19.1

Model provider

Momin-Aldahdouh

Model tree

Base: Qwen/Qwen3-0.6B
Adapter: this model

Modalities

Input: Text
Output: Text
