naazimsnh02

fraudsentinel-qwen3-14b-lora


License: apache-2.0

Capabilities

The model is trained to act as an enterprise fraud and AML investigation assistant across six task types, plus an optional Deep Analysis mode:

Structured JSON risk scoring — calibrated risk score (0.0–1.0), risk level (LOW / MEDIUM / HIGH / CRITICAL), typology, key signals, feature importance, recommended action, and SAR rationale
Explainable alerts — evidence-grounded investigator-facing natural language explanations tied to actual transaction features
Typology classification — primary and secondary fraud/laundering pattern identification (card-not-present, account takeover, fan-out, gather-scatter, structuring, etc.)
6-level recommended action — AUTO_APPROVE → APPROVE_WITH_MONITORING → STEP_UP_AUTH → TEMPORARY_HOLD → AUTO_BLOCK → SAR_REVIEW
SAR drafting — FinCEN-aligned Suspicious Activity Report narrative generation for human review and filing
Multi-turn HITL dialogue — investigator follow-ups ("Why this risk level?", "What else should I check?", "Customer confirmed legit — what next?")
Deep Analysis mode — optional Chain-of-Thought reasoning via Qwen3's thinking tokens for complex multi-account cases
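The arrow notation in the recommended-action entry suggests an ordered escalation ladder. A minimal sketch of that ordering follows, assuming the six actions are strictly increasing in severity in the order listed. The `Action` enum and the `most_severe` helper are hypothetical names used for illustration and are not part of the model or its tooling.

```python
from enum import IntEnum


class Action(IntEnum):
    """Recommended actions, numbered in the order the card lists them (assumed severity order)."""
    AUTO_APPROVE = 0
    APPROVE_WITH_MONITORING = 1
    STEP_UP_AUTH = 2
    TEMPORARY_HOLD = 3
    AUTO_BLOCK = 4
    SAR_REVIEW = 5


def most_severe(*names: str) -> str:
    """Return the name of the highest-severity action among the given action names."""
    return max(Action[name] for name in names).name


# Example: combining per-transaction recommendations into one case-level action.
print(most_severe("STEP_UP_AUTH", "AUTO_BLOCK", "APPROVE_WITH_MONITORING"))
```

An integer-backed enum keeps comparisons explicit, and an unknown action name raises a `KeyError` instead of being silently accepted.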

Training Details

Property	Value
Base model	`unsloth/Qwen3-14B` (Apache-2.0)
Method	Supervised Fine-Tuning (SFT) + LoRA
LoRA rank	16
LoRA alpha	32
Target modules	q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj (all-linear)
LoRA dropout	0 (Unsloth-optimized)
Trainable parameters	64,225,280 (0.433% of 14.83B total)
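
The trainable-parameter count in the table can be reproduced by hand. A rank-r LoRA adapter on a linear layer with d_in inputs and d_out outputs adds r × (d_in + d_out) weights. The sketch below assumes the published Qwen3-14B dimensions (hidden size 5120, MLP intermediate size 17408, 40 layers, 40 query heads and 8 key/value heads with head dimension 128). These dimensions are not stated in this card and should be checked against the base model's `config.json`.

```python
# LoRA adds two matrices per adapted layer: A (rank x d_in) and B (d_out x rank).
RANK = 16

# Assumed Qwen3-14B dimensions (not stated in this card; verify in config.json).
HIDDEN = 5120
INTERMEDIATE = 17408
NUM_LAYERS = 40
Q_DIM = 40 * 128   # query heads * head_dim
KV_DIM = 8 * 128   # key/value heads * head_dim (grouped-query attention)

# (d_in, d_out) for each target module listed in the table.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, Q_DIM),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (Q_DIM, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(d_in: int, d_out: int, rank: int = RANK) -> int:
    """Number of weights a rank-`rank` LoRA adapter adds to one linear layer."""
    return rank * (d_in + d_out)


per_layer = sum(lora_params(*shape) for shape in TARGET_SHAPES.values())
trainable = per_layer * NUM_LAYERS

print(f"{trainable:,}")                       # 64,225,280
print(f"{100 * trainable / 14.83e9:.3f}%")    # 0.433%
```

Under these assumed dimensions the result matches the 64,225,280 figure and the 0.433% share reported above.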

Usage

Load with Unsloth (recommended)

python
from unsloth import FastLanguageModel
import torch

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "naazimsnh02/fraudsentinel-qwen3-14b-lora",
    max_seq_length = 4096,
    dtype = torch.bfloat16,
    load_in_4bit = False,
)
FastLanguageModel.for_inference(model)

Load with PEFT + Transformers

python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-14B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "naazimsnh02/fraudsentinel-qwen3-14b-lora")
tokenizer = AutoTokenizer.from_pretrained("naazimsnh02/fraudsentinel-qwen3-14b-lora")

Inference Example

python
messages = [
    {"role": "system", "content": "You are FraudSentinel, an expert fraud detection and AML investigation assistant."},
    {"role": "user", "content": (
        "Analyze this card transaction and return a structured JSON risk assessment.\n\n"
        "Transaction: amount=$828.62, category=misc_net, hour=2, "
        "amount_vs_category_p95=2.16x, tx_24h=4, geo_km=1847"
    )},
]

# Thinking mode OFF (fast mode — default for Tier-2 triage)
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

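# Low temperature keeps the structured output close to deterministic.
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.1,
        do_sample=True,
    )
# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[1]
print(tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True))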

Deep Analysis mode (Chain-of-Thought for complex cases):

python
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,   # activates Qwen3 thinking tokens
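)
# Tokenize `text` and call model.generate() as in the fast-mode example above;
# allow a larger max_new_tokens, since the reasoning trace is generated before the answer.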
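
With thinking enabled, Qwen3 emits its reasoning between `<think>` and `</think>` before the final answer, so downstream code usually needs to separate the two. The helper below is a minimal sketch that works on the decoded string. `split_thinking` is a hypothetical name, and the sketch assumes the tags survive decoding (this should be verified against the tokenizer settings in use).

```python
def split_thinking(decoded: str, close_tag: str = "</think>") -> tuple[str, str]:
    """Split a decoded response into (reasoning, answer).

    If no closing tag is present, the whole response is treated as the answer.
    """
    head, sep, tail = decoded.rpartition(close_tag)
    if not sep:
        return "", decoded.strip()
    # Drop the opening tag, if present, and surrounding whitespace.
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


# Example with a hand-written response string (not real model output).
sample = '<think>\nGeo distance of 1847 km is unusual.\n</think>\n\n{"risk_level": "HIGH"}'
reasoning, answer = split_thinking(sample)
print(reasoning)
print(answer)
```

Splitting on the last closing tag keeps the answer intact even if the reasoning text itself mentions the tag.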

Output Schema (Structured Task)

json
{
  "risk_score": 0.84,
  "risk_level": "HIGH",
  "conclusion": "FRAUDULENT",
  "primary_typology": "card-not-present account takeover / stolen-card online cash-out",
  "secondary_typology": "account_takeover",
  "key_signals": [
    "amount_exceeds_category_p95",
    "high_risk_merchant_category",
    "unusual_hour_activity"
  ],
  "explanation": "Transaction amount $828.62 exceeds the 95th-percentile for misc_net purchases...",
  "feature_importance": {
    "amount_exceeds_category_p95": 0.46,
    "high_risk_merchant_category": 0.28,
    "unusual_hour_activity": 0.26
  },
  "recommended_action": "AUTO_BLOCK",
  "sar_required": false,
  "sar_rationale": null
}
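
Because the structured output feeds downstream decisions, it is worth checking each response against the schema before acting on it. The sketch below is one possible validator built from the field names and value sets shown in this card. `validate_assessment`, the choice of required keys, and the rule that `sar_required: true` needs a non-empty `sar_rationale` are assumptions made for illustration, not documented behavior of the model.

```python
import json

# Allowed values taken from the Capabilities section of this card.
RISK_LEVELS = ("LOW", "MEDIUM", "HIGH", "CRITICAL")
ACTIONS = (
    "AUTO_APPROVE", "APPROVE_WITH_MONITORING", "STEP_UP_AUTH",
    "TEMPORARY_HOLD", "AUTO_BLOCK", "SAR_REVIEW",
)
# Assumed minimum set of keys; adjust to whatever the pipeline actually relies on.
REQUIRED_KEYS = {
    "risk_score", "risk_level", "primary_typology",
    "key_signals", "recommended_action", "sar_required",
}


def validate_assessment(raw: str) -> list[str]:
    """Return a list of problems found in a model response; an empty list means it passed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(data))]

    score = data.get("risk_score")
    is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
    if not is_number or not 0.0 <= score <= 1.0:
        problems.append("risk_score must be a number in [0.0, 1.0]")
    if data.get("risk_level") not in RISK_LEVELS:
        problems.append("risk_level is not one of the four documented levels")
    if data.get("recommended_action") not in ACTIONS:
        problems.append("recommended_action is not one of the six documented actions")
    # Assumed consistency rule: a required SAR should come with a rationale.
    if data.get("sar_required") is True and not data.get("sar_rationale"):
        problems.append("sar_required is true but sar_rationale is empty")
    return problems


# Example with a trimmed version of the schema sample above.
sample = json.dumps({
    "risk_score": 0.84, "risk_level": "HIGH",
    "primary_typology": "card-not-present account takeover / stolen-card online cash-out",
    "key_signals": ["amount_exceeds_category_p95"],
    "recommended_action": "AUTO_BLOCK", "sar_required": False, "sar_rationale": None,
})
print(validate_assessment(sample))
```

Returning a list of problems rather than raising on the first one makes it easier to log every schema violation for a rejected response.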

Limitations

Prototype/research use. Source data is synthetic/semi-synthetic. Do not use for real customer adjudication without independent validation, bias review, and human-in-the-loop controls.
AI-generated SAR drafts must be reviewed and edited by a human before filing.
The adapter was trained with thinking mode OFF (enable_thinking=False), so Deep Analysis relies on the base model's reasoning behavior. Enabling thinking mode at inference activates Qwen3's CoT capabilities but adds roughly 3–5 s of latency per response.
Feature importance values are deterministic heuristics from the training data generation pipeline, not SHAP or model-derived importances.

License

Apache-2.0 (base model and adapter).

