License: apache-2.0
Model Details
| Property | Value |
|---|---|
| Author | Momin Aldahdouh |
| Base model | Qwen/Qwen3-0.6B (596M params, 1.4 GB) |
| Adapter size | ~39 MB (LoRA rank=16) |
| Task | Kernel fault diagnosis / corrective action suggestion |
| Architecture | LoRA over dense transformer |
| License | Apache 2.0 |
What it does
Given a structured kernel fault report (MominOS harness envelope format), the model:
- Identifies the fault type (page fault, GPF, stack overflow, etc.)
- Explains the root cause in plain terms
- Suggests a specific corrective action
Harness envelope format
```markdown
[SYSTEM] MominOS kernel fault diagnostician. Analyze and suggest a fix.
[FAULT] vector=14 (Page Fault) err=0x0006 rip=0x0000000000401234 cr2=0x0000000000000008 tid=3 cwd=/bin
[REGISTERS] rax=0x0 rdi=0x8 rsi=0x100 rsp=0x7fff00100ff8
[RECENT_SYSCALLS]
SYS_OPEN /bin/sh 0 -> 3
SYS_READ 3 4096 -> 4096
[LOG]
[VFS] opened /bin/sh
[SCHED] thread 3 running
[QUERY] Diagnose this fault and suggest a corrective action.
```
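As a rough illustration of how a caller might assemble an envelope like the one above, the sketch below formats a fault record into the same bracketed sections. The `build_envelope` helper and its parameter names are hypothetical and are not part of MominOS or this repository; the one-section-per-line layout is assumed from the example.

```python
def build_envelope(vector, name, err, rip, cr2, tid, cwd,
                   registers, syscalls, log, query):
    """Hypothetical helper: format one fault record as a harness envelope."""
    # Registers are printed as name=0x... pairs separated by spaces.
    regs = " ".join(f"{reg}={value:#x}" for reg, value in registers.items())
    lines = [
        "[SYSTEM] MominOS kernel fault diagnostician. Analyze and suggest a fix.",
        # err is zero-padded to 4 hex digits, rip and cr2 to 16 hex digits.
        f"[FAULT] vector={vector} ({name}) err={err:#06x} rip={rip:#018x} "
        f"cr2={cr2:#018x} tid={tid} cwd={cwd}",
        f"[REGISTERS] {regs}",
        "[RECENT_SYSCALLS]",
        *syscalls,
        "[LOG]",
        *log,
        f"[QUERY] {query}",
    ]
    return "\n".join(lines)


envelope = build_envelope(
    vector=14, name="Page Fault", err=0x6, rip=0x401234, cr2=0x8,
    tid=3, cwd="/bin",
    registers={"rax": 0x0, "rdi": 0x8, "rsi": 0x100, "rsp": 0x7FFF00100FF8},
    syscalls=["SYS_OPEN /bin/sh 0 -> 3", "SYS_READ 3 4096 -> 4096"],
    log=["[VFS] opened /bin/sh", "[SCHED] thread 3 running"],
    query="Diagnose this fault and suggest a corrective action.",
)
print(envelope)
```

Keeping the formatting in one function makes it easier to keep inference prompts byte-for-byte consistent with whatever layout the training data used.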
How to Use
You need both the base model (1.4 GB) and this adapter (39 MB):
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen3-0.6B"
adapter_id = "Momin-Aldahdouh/MominoMoE-v2"

# Load the tokenizer from the adapter repo, then attach the LoRA adapter to the base model.
tokenizer = AutoTokenizer.from_pretrained(adapter_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

SYSTEM = (
    "You are MominOS, a kernel fault diagnostician running embedded in an x86-64 OS. "
    "Given a kernel fault report in harness envelope format, identify the fault type, "
    "root cause, and suggest a specific corrective action. Be concise and precise. /no_think"
)

fault_prompt = """[FAULT] vector=14 (Page Fault) err=0x0006 rip=0x0000000000401234 cr2=0x0000000000000008 tid=3 cwd=/bin
[REGISTERS] rax=0x0 rdi=0x8 rsi=0x100 rsp=0x7fff00100ff8
[RECENT_SYSCALLS]
SYS_OPEN /bin/sh 0 -> 3
SYS_READ 3 4096 -> 4096
[LOG]
[VFS] opened /bin/sh
[SCHED] thread 3 running
[QUERY] Diagnose this fault and suggest a corrective action."""

messages = [
    {"role": "system", "content": SYSTEM},
    {"role": "user", "content": fault_prompt},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=200, temperature=0.1, do_sample=True)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```
Training
Data
10,000 synthetic kernel fault diagnosis samples generated by training/generate_kernel_data.py. Each sample is a harness envelope prompt paired with a structured diagnosis response.
Fault distribution:
| Fault Type | Weight |
|---|---|
| Null dereference | 20% |
| Bad pointer | 18% |
| General Protection Fault | 14% |
| Stack overflow | 12% |
| Use-after-free | 12% |
| Protection violation | 10% |
| Divide by zero | 7% |
| Invalid opcode | 5% |
| Double fault | 2% |
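The generator script itself is not reproduced in this card, so the following is only a sketch of one plausible way to draw fault types with the weights in the table, using the standard library's weighted sampling. The function name `sample_fault_types` and the fixed seed are assumptions made for illustration.

```python
import random
from collections import Counter

# Weights copied from the fault distribution table (percent).
FAULT_WEIGHTS = {
    "Null dereference": 20,
    "Bad pointer": 18,
    "General Protection Fault": 14,
    "Stack overflow": 12,
    "Use-after-free": 12,
    "Protection violation": 10,
    "Divide by zero": 7,
    "Invalid opcode": 5,
    "Double fault": 2,
}


def sample_fault_types(n, seed=0):
    """Hypothetical sampler: draw n fault types according to FAULT_WEIGHTS."""
    rng = random.Random(seed)  # seeded for reproducible datasets
    names = list(FAULT_WEIGHTS)
    weights = list(FAULT_WEIGHTS.values())
    return rng.choices(names, weights=weights, k=n)


counts = Counter(sample_fault_types(10_000))
for name, weight in FAULT_WEIGHTS.items():
    print(f"{name:<26} target={weight:>2}%  sampled={counts[name] / 100:.1f}%")
```

With 10,000 draws, rare classes such as double faults still get on the order of 200 samples, which is worth keeping in mind when judging per-class accuracy.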
LoRA config
| Param | Value |
|---|---|
| Rank (r) | 16 |
| Alpha | 32 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Dropout | 0.05 |
| Trainable params | ~10M (1.69% of 596M) |
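The trainable parameter count can be sanity-checked by hand: each adapted linear layer gains two low-rank matrices, A (rank × in_features) and B (out_features × rank). The sketch below does that arithmetic. The Qwen3-0.6B layer dimensions are assumptions not stated in this card; check them against the base model's `config.json` before relying on the result.

```python
# Assumed Qwen3-0.6B dimensions (not stated in this card).
HIDDEN = 1024
INTERMEDIATE = 3072
NUM_LAYERS = 28
Q_OUT = 16 * 128    # attention heads x head_dim
KV_OUT = 8 * 128    # key/value heads x head_dim
RANK = 16           # from the LoRA config table

# (in_features, out_features) for each targeted projection.
TARGETS = {
    "q_proj": (HIDDEN, Q_OUT),
    "k_proj": (HIDDEN, KV_OUT),
    "v_proj": (HIDDEN, KV_OUT),
    "o_proj": (Q_OUT, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(rank, shapes, num_layers):
    """Count LoRA weights: rank * (in + out) per adapted projection."""
    per_layer = sum(rank * (n_in + n_out) for n_in, n_out in shapes.values())
    return per_layer * num_layers


total = lora_params(RANK, TARGETS, NUM_LAYERS)
print(f"trainable LoRA params: {total:,}")
print(f"share of 596M base:    {total / 596e6:.2%}")
print(f"fp32 adapter size:     {total * 4 / 2**20:.1f} MiB")
```

Under these assumed dimensions the count comes to 10,092,544 parameters, which lines up with the ~10M and 1.69% figures in the table, and with the ~39 MB adapter size if the weights are stored in fp32.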
Training run
| Metric | Value |
|---|---|
| Steps | 5,000 |
| Effective batch size | 16 (4 per device × 4 grad accum) |
| Learning rate | 2e-4 (cosine schedule) |
| Hardware | NVIDIA L4 (23 GB VRAM), GCP g2-standard-8 |
| Duration | ~4h 22m |
| Final train loss | 0.3086 |
| Final eval loss | 0.2896 |
| Token accuracy | ~89.2% |
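To make the schedule figures concrete, the sketch below computes the effective batch size, the number of samples processed, and a plain cosine learning-rate decay. The card does not say whether warmup was used or how the 10,000 samples were split between training and evaluation, so the no-warmup schedule and the use of the full dataset size are assumptions.

```python
import math

PEAK_LR = 2e-4
TOTAL_STEPS = 5_000
PER_DEVICE_BATCH = 4
GRAD_ACCUM = 4
DATASET_SIZE = 10_000  # assumption: all samples used for training


def cosine_lr(step, peak=PEAK_LR, total=TOTAL_STEPS):
    """Cosine decay from `peak` to 0 with no warmup (assumed)."""
    progress = min(max(step / total, 0.0), 1.0)
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))


effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM
samples_seen = effective_batch * TOTAL_STEPS
passes = samples_seen / DATASET_SIZE

print(f"effective batch size: {effective_batch}")
print(f"samples processed:    {samples_seen:,}")
print(f"passes over the data: {passes:.1f}")
for step in (0, 1_250, 2_500, 3_750, 5_000):
    print(f"step {step:>5}: lr = {cosine_lr(step):.2e}")
```

If part of the dataset was held out for evaluation, the number of passes over the training split would be somewhat higher than this estimate.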
Example Output
Input: Page fault at cr2=0x8, err=0x6 (write, non-present, user mode)
MominoMoE-v2:
Fault type: Null pointer dereference (write). CR2=0x8 indicates a write to offset 8 of a NULL pointer — likely a struct field access on an uninitialized pointer. err=0x6 confirms: write access, page not present, user mode. Corrective action: validate the pointer before dereferencing in the caller at RIP=0x401234. Add a NULL check before the struct field write.
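The reading of `err=0x6` in the example follows directly from the low bits of the x86 page-fault error code, and can be checked independently of the model. The sketch below decodes those bits; the function names and the first-page NULL heuristic are illustrative assumptions, not part of MominOS.

```python
# Low bits of the x86 page-fault error code pushed by the CPU.
PF_PRESENT = 1 << 0   # 0 = page not present, 1 = protection violation
PF_WRITE = 1 << 1     # 0 = read access, 1 = write access
PF_USER = 1 << 2      # 0 = supervisor mode, 1 = user mode
PF_RESERVED = 1 << 3  # reserved bit set in a paging structure
PF_INSTR = 1 << 4     # fault caused by an instruction fetch


def decode_page_fault(err):
    """Translate a page-fault error code into readable fields."""
    return {
        "cause": "protection violation" if err & PF_PRESENT else "page not present",
        "access": "write" if err & PF_WRITE else "read",
        "mode": "user" if err & PF_USER else "supervisor",
        "reserved_bit": bool(err & PF_RESERVED),
        "instruction_fetch": bool(err & PF_INSTR),
    }


def looks_like_null_deref(cr2, page_size=4096):
    # Heuristic (assumption): a fault address inside the first page is
    # usually a small field offset from a NULL pointer.
    return cr2 < page_size


info = decode_page_fault(0x6)
print(info)
print("likely NULL dereference:", looks_like_null_deref(0x8))
```

A deterministic check like this is a cheap way to cross-validate the model's diagnosis before acting on its suggested fix.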
Predecessor
MominoMoE_1.2B was a 1.2B-parameter custom MoE transformer trained from scratch. It produced poor results on kernel prompts because its training data consisted of generic web API trajectories rather than OS/kernel fault data.
Citation
```bibtex
@misc{mominomoe2,
  author = {Momin Aldahdouh},
  title  = {MominoMoE-v2: LoRA-finetuned kernel fault diagnostician for MominOS},
  year   = {2026},
  url    = {https://huggingface.co/Momin-Aldahdouh/MominoMoE-v2}
}
```
Framework versions
- PEFT 0.19.1