Keowu/monare-re-qwen25-coder-7b API & Inference Endpoint

Inspiration

This work is inspired by the ReCopilot research paper:

ReCopilot: Reverse Engineering Copilot in Binary Analysis
arXiv: 2505.16366
https://arxiv.org/abs/2505.16366

This model is not an official ReCopilot release and is not affiliated with the authors of that paper. It follows the same broad idea: training a model on source-code-to-stripped-binary-to-decompiler examples instead of relying only on prompt engineering.

Training Data

The initial training dataset was built from open-source C/C++ projects. The pipeline used:

open-source C/C++ source
-> debug build with DWARF/symbols
-> stripped binary
-> IDA/Hex-Rays pseudocode export
-> ground-truth extraction from debug metadata
-> supervised fine-tuning JSONL
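
The last two pipeline steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual tooling: it assumes ground-truth records and decompiler exports are both keyed by function address, and the `prompt`/`completion` field names, the helper names, and the sample function names are all hypothetical.

```python
import json


def build_sft_records(ground_truth, pseudocode_by_ea):
    """Pair each ground-truth record with the pseudocode at the same address."""
    records = []
    for ea, truth in sorted(ground_truth.items()):
        pseudocode = pseudocode_by_ea.get(ea)
        if pseudocode is None:
            # No decompiler export for this address: skip rather than guess.
            continue
        records.append({
            "prompt": pseudocode,                            # hypothetical field name
            "completion": json.dumps(truth, sort_keys=True), # hypothetical field name
        })
    return records


def to_jsonl(records):
    """Serialize one JSON object per line."""
    return "\n".join(json.dumps(record) for record in records)


# Hypothetical sample data: two ground-truth functions, one decompiler export.
ground_truth = {
    "0x401230": {"suggested_name": "load_file_into_buffer"},
    "0x401300": {"suggested_name": "free_buffer"},
}
pseudocode_by_ea = {
    "0x401230": "int __fastcall sub_401230(__int64 a1, char *a2, unsigned int a3)",
}

jsonl = to_jsonl(build_sft_records(ground_truth, pseudocode_by_ea))
print(jsonl)
```

Records without a matching export are dropped instead of being filled in, which is one plausible reason a final example count can be lower than the ground-truth record count.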

Current training snapshot:

225 debug/stripped binary artifacts
3,262 ground-truth function records
225 IDA/Hex-Rays exports
3,256 final SFT examples
training split: 2,758 examples
validation split: 346 examples
test split: 152 examples
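
As a quick sanity check on the snapshot, the three splits should add up to the final SFT example count. The snippet below only redoes that arithmetic with the numbers listed above.

```python
# Split sizes copied from the training snapshot.
splits = {"train": 2758, "validation": 346, "test": 152}
total = sum(splits.values())

# The splits should account for every final SFT example.
assert total == 3256

for name, count in splits.items():
    print(f"{name}: {count} ({count / total:.1%})")
```

This works out to roughly 84.7% training, 10.6% validation, and 4.7% test.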

The dataset focuses on function names, argument names, argument types, and strict JSON output formatting. Struct recovery, data-flow analysis, and richer cross-function context are planned as future improvements.

Output Format

The model is trained to return only valid JSON. A typical output:

{
  "function": {
    "ea": "0x401230",
    "old_name": "sub_401230",
    "suggested_name": "aes_cbc_encrypt_buffer",
    "confidence": 0.88,
    "reason": "Matched stripped decompiler function to debug-symbol ground truth at the same address."
  },
  "arguments": [
    {
      "old_name": "a1",
      "new_name": "ctx",
      "type": "struct AES_ctx *",
      "confidence": 0.82
    }
  ],
  "locals": [],
  "structs": [],
  "comments": [],
  "warnings": []
}
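
Because downstream tooling depends on this shape, it can help to check a parsed output before using it. The following is a minimal sketch with a hypothetical `validate_patch` helper; the required keys are taken from the example above, and the assumption that confidence values lie in [0, 1] is inferred from the sample values, not stated by the model card.

```python
TOP_LEVEL_LISTS = ("arguments", "locals", "structs", "comments", "warnings")
FUNCTION_KEYS = ("ea", "old_name", "suggested_name", "confidence", "reason")


def is_confidence(value):
    """True for a real number in [0, 1]; the range itself is an assumption."""
    return (
        isinstance(value, (int, float))
        and not isinstance(value, bool)
        and 0.0 <= value <= 1.0
    )


def validate_patch(patch):
    """Return a list of problems; an empty list means the shape looks right."""
    if not isinstance(patch, dict):
        return ["top level is not a JSON object"]
    problems = []
    function = patch.get("function")
    if not isinstance(function, dict):
        problems.append("'function' is missing or not an object")
    else:
        for key in FUNCTION_KEYS:
            if key not in function:
                problems.append(f"function.{key} is missing")
        if "confidence" in function and not is_confidence(function["confidence"]):
            problems.append("function.confidence is not a number in [0, 1]")
    for key in TOP_LEVEL_LISTS:
        if not isinstance(patch.get(key), list):
            problems.append(f"'{key}' is missing or not a list")
    return problems


example = {
    "function": {
        "ea": "0x401230",
        "old_name": "sub_401230",
        "suggested_name": "aes_cbc_encrypt_buffer",
        "confidence": 0.88,
        "reason": "Matched stripped decompiler function to debug-symbol ground truth at the same address.",
    },
    "arguments": [
        {"old_name": "a1", "new_name": "ctx", "type": "struct AES_ctx *", "confidence": 0.82}
    ],
    "locals": [],
    "structs": [],
    "comments": [],
    "warnings": [],
}

print(validate_patch(example))           # no problems for the documented example
print(validate_patch({"function": {}}))  # lists every missing field
```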

Intended Use

Use this model for research and assisted reverse engineering tasks such as:

suggesting function names from decompiler pseudocode;
recovering argument names and approximate C types;
generating conservative JSON semantic patches;
bootstrapping review workflows for stripped C/C++ binaries.

All suggestions should be reviewed by a human reverse engineer before being applied to an analysis database.
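
One way to support that review step is to flatten a patch into rows a reviewer can walk through, least confident first. This is only a sketch: `review_queue` is a hypothetical helper, and it deliberately applies nothing to the analysis database.

```python
def review_queue(patch):
    """Flatten a patch into (kind, old, new, confidence) rows, least confident first."""
    rows = []
    function = patch.get("function", {})
    if function.get("suggested_name"):
        rows.append((
            "function",
            function.get("old_name", ""),
            function["suggested_name"],
            function.get("confidence", 0.0),
        ))
    for argument in patch.get("arguments", []):
        rows.append((
            "argument",
            argument.get("old_name", ""),
            argument.get("new_name", ""),
            argument.get("confidence", 0.0),
        ))
    # Lowest confidence first so the riskiest suggestions get attention early.
    return sorted(rows, key=lambda row: row[3])


patch = {
    "function": {
        "old_name": "sub_401230",
        "suggested_name": "aes_cbc_encrypt_buffer",
        "confidence": 0.88,
    },
    "arguments": [{"old_name": "a1", "new_name": "ctx", "confidence": 0.82}],
}

for kind, old, new, confidence in review_queue(patch):
    print(f"[{confidence:.2f}] {kind}: {old} -> {new}")
```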

Limitations

This is an early experimental model.

It was trained with only a short QLoRA run.
It may emit invalid JSON on long or truncated prompts.
It can confuse similar cryptographic or compression routines.
It should not be used for malware-family attribution.
It should not be treated as a source of truth.
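
Given that invalid JSON is a known failure mode, callers may want to parse the output defensively. The sketch below uses a hypothetical `parse_model_output` helper built on the standard library's `json.JSONDecoder.raw_decode`; it tolerates stray text around the object and returns `None` when the object is truncated or malformed.

```python
import json


def parse_model_output(text):
    """Return the JSON object starting at the first '{', or None if it does not parse."""
    start = text.find("{")
    if start == -1:
        return None
    try:
        # raw_decode stops at the end of the first complete JSON value,
        # so trailing text after the object is ignored.
        obj, _end = json.JSONDecoder().raw_decode(text, start)
    except json.JSONDecodeError:
        return None
    return obj


print(parse_model_output('Sure: {"warnings": []} done'))
print(parse_model_output('{"function": {"ea": "0x401230"'))  # truncated output
```

Only the outermost object is attempted. Falling back to a nested `{` could return a fragment of a truncated answer, which would look valid while being incomplete.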

Prompt Template

<TASK>recover_semantic_patch</TASK>
<TARGET>
EA: 0x401230
Name: sub_401230
Pseudocode:
...
</TARGET>
<EVIDENCE>
Strings: [...]
Callees: [...]
Callers: [...]
Imports: [...]
Offset accesses: [...]
Data flow: [...]
</EVIDENCE>
<SCHEMA>
{"function":{"ea":"0x401230","old_name":"sub_401230","suggested_name":"","confidence":0.0,"reason":""},"arguments":[],"locals":[],"structs":[],"comments":[],"warnings":[]}
</SCHEMA>
Return only valid JSON. Be conservative. Do not invent facts.
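
Filling this template by hand is error-prone, so a small builder can help. The following is a minimal sketch: `build_prompt` and the lowercase evidence keys are hypothetical names, and it assumes compact JSON is acceptable for the evidence fields, as in the inference example.

```python
import json

EVIDENCE_FIELDS = [
    ("Strings", "strings"),
    ("Callees", "callees"),
    ("Callers", "callers"),
    ("Imports", "imports"),
    ("Offset accesses", "offset_accesses"),
    ("Data flow", "data_flow"),
]


def compact(value):
    """Serialize without extra whitespace, matching the template's schema line."""
    return json.dumps(value, separators=(",", ":"))


def build_prompt(ea, name, pseudocode, evidence=None):
    """Fill the prompt template; missing evidence fields become empty lists."""
    evidence = evidence or {}
    schema = {
        "function": {
            "ea": ea,
            "old_name": name,
            "suggested_name": "",
            "confidence": 0.0,
            "reason": "",
        },
        "arguments": [],
        "locals": [],
        "structs": [],
        "comments": [],
        "warnings": [],
    }
    lines = [
        "<TASK>recover_semantic_patch</TASK>",
        "<TARGET>",
        f"EA: {ea}",
        f"Name: {name}",
        "Pseudocode:",
        pseudocode.strip(),
        "</TARGET>",
        "<EVIDENCE>",
    ]
    for label, key in EVIDENCE_FIELDS:
        lines.append(f"{label}: {compact(evidence.get(key, []))}")
    lines += [
        "</EVIDENCE>",
        "<SCHEMA>",
        compact(schema),
        "</SCHEMA>",
        "Return only valid JSON. Be conservative. Do not invent facts.",
    ]
    return "\n".join(lines)


prompt = build_prompt(
    "0x401230",
    "sub_401230",
    "int __fastcall sub_401230(__int64 a1) { return 0; }",
    {"imports": ["fopen", "fread", "fclose"]},
)
print(prompt)
```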

Loading the LoRA Adapter

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base = "Qwen/Qwen2.5-Coder-7B-Instruct"
adapter = "Keowu/monare-re-qwen25-coder-7b"

# Load the base model with 4-bit quantization (requires bitsandbytes).
quant_config = BitsAndBytesConfig(load_in_4bit=True)

tokenizer = AutoTokenizer.from_pretrained(base, trust_remote_code=True)

model = AutoModelForCausalLM.from_pretrained(
    base,
    device_map="auto",
    quantization_config=quant_config,
    trust_remote_code=True,
)

# Attach the LoRA adapter on top of the quantized base model.
model = PeftModel.from_pretrained(model, adapter)
model.eval()

# Run inference on an example prompt that follows the template above.
system_prompt = (
    "You are a reverse engineering assistant specialized in IDA Hex-Rays "
    "pseudocode. Return only valid JSON. Be conservative."
)

user_prompt = """<TASK>recover_semantic_patch</TASK>
<TARGET>
EA: 0x401230
Name: sub_401230
Pseudocode:
int __fastcall sub_401230(__int64 a1, char *a2, unsigned int a3)
{
  FILE *v3;
  int result;

  v3 = fopen(a2, "rb");
  if (!v3)
    return -1;
  result = fread((void *)(a1 + 32), 1u, a3, v3);
  fclose(v3);
  *(_DWORD *)(a1 + 16) = result;
  return result;
}
</TARGET>
<EVIDENCE>
Strings: ["rb"]
Callees: [{"name":"fopen"},{"name":"fread"},{"name":"fclose"}]
Callers: []
Imports: ["fopen","fread","fclose"]
Offset accesses: ["a1 + 32","a1 + 16"]
Data flow: []
</EVIDENCE>
<SCHEMA>
{"function":{"ea":"0x401230","old_name":"sub_401230","suggested_name":"","confidence":0.0,"reason":""},"arguments":[],"locals":[],"structs":[],"comments":[],"warnings":[]}
</SCHEMA>
Return only valid JSON. Be conservative. Do not invent facts."""

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt},
]

# return_dict=True yields input_ids and attention_mask together.
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=False,  # greedy decoding; a temperature setting would be ignored
        eos_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, not the echoed prompt.
prompt_length = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True))
