What is CyberStrike?
CyberStrike-OffSec-35B is a domain-specialized large language model built for offensive security professionals, penetration testers, and security researchers. Fine-tuned from Qwen3.6-35B-A3B with a two-stage pipeline (SFT followed by DPO), it delivers expert-level knowledge across the entire offensive security lifecycle:
- Vulnerability Discovery — SQL injection, XSS, SSRF, deserialization, business logic flaws
- MITRE ATT&CK Operations — Technique identification, kill chain analysis, threat mapping
- Exploit Development — PoC creation, payload crafting, evasion techniques
- Cloud & Infrastructure — AWS/Azure/GCP misconfigurations, container escapes, IAM abuse
- Red Team Operations — C2 setup, lateral movement, persistence, EDR evasion
- Compliance & Standards — NIST, OWASP ASVS, CIS benchmarks, CVSS scoring
Model Format: This is the full-precision BF16 model (67 GB, 26 safetensors shards). For quantized versions, see below.
Available Versions
Benchmark Results
CyberStrike ranks first on SecEval (ahead of GPT-4-turbo) and on the SECURE MAET and CWET tasks (ahead of GPT-4), and places fourth of 25 models on CyberMetric-10000.
SecEval — #1 on Leaderboard
Outperforms GPT-4-turbo by +2.32 points across 9 cybersecurity domains, 2,189 questions.
| Rank | Model | Overall | Network Sec | Web Sec | PenTest | Cryptography |
|---|---|---|---|---|---|---|
| #1 | CyberStrike-OffSec-35B | 81.39% | 85.09% | 85.34% | 82.26% | 75.00% |
| #2 | GPT-4-turbo | 79.07% | 75.65% | 82.15% | | |
| Domain | CyberStrike | GPT-4-turbo | Delta |
|---|---|---|---|
| Network Security | 85.09% | 75.65% | +9.44 |
| Web Security | 85.34% | 82.15% | +3.19 |
| Vulnerability | 83.33% | 76.05% | +7.28 |
| Application Security | 82.29% | 75.25% | +7.04 |
CyberStrike leads GPT-4-turbo in all 9 domains. The largest gains are in Cryptography (+10.71) and Network Security (+9.44).
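The deltas reported for SecEval are plain percentage-point differences between the two models. A minimal sketch that recomputes them from the scores listed above; the dictionary layout and the `delta_points` helper are illustrative only, not part of any benchmark tooling:

```python
# Scores in percent, copied from the SecEval tables above:
# (CyberStrike, GPT-4-turbo)
scores = {
    "Overall": (81.39, 79.07),
    "Network Security": (85.09, 75.65),
    "Web Security": (85.34, 82.15),
    "Vulnerability": (83.33, 76.05),
    "Application Security": (82.29, 75.25),
}


def delta_points(cyberstrike: float, baseline: float) -> float:
    """Return the percentage-point difference, rounded to two decimals."""
    return round(cyberstrike - baseline, 2)


deltas = {domain: delta_points(cs, gpt) for domain, (cs, gpt) in scores.items()}

for domain, delta in deltas.items():
    print(f"{domain}: {delta:+.2f}")
```

Note that these are absolute point differences, not relative improvements.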
SECURE — #1 on MITRE ATT&CK & CWE Tasks
Outperforms GPT-4 by +5.34 points on MITRE ATT&CK extraction. Evaluated on ICS cybersecurity scenarios.
| Task | CyberStrike | GPT-4 | Llama3-70B | Gemini-Pro |
|---|---|---|---|---|
| MAET (MITRE ATT&CK) | 93.94% | 88.6% | 86.3% | 86.2% |
| CWET (CWE Knowledge) | 93.05% | 89.6% | 90.4% | 87.8% |
CyberMetric-10000 — #4 out of 25 Models
9,189 expert-validated cybersecurity MCQ questions across NIST, RFC, and industry standards.
| Rank | Model | Score |
|---|---|---|
| #1 | GPT-4o | 88.89% |
| #2 | GPT-4-turbo | 88.50% |
| #3 | GEMINI-pro 1.0 | 87.50% |
| #4 | CyberStrike-OffSec-35B | 86.61% |
| #5 | Mixtral-8x7B-Instruct | 87.00% |
| #6 | Falcon-180B-Chat | |
General Benchmarks
| Benchmark | Score |
|---|---|
| MMLU (overall) | 76.94% |
| MMLU — Social Sciences | 86.81% |
| MMLU — Computer Security | 86.00% |
| MMLU — Other | 81.43% |
| MMLU — Security Studies | 80.00% |
| MMLU — STEM | 73.87% |
| MMLU — Humanities | 69.59% |
| HellaSwag (acc_norm) | 79.61% |
| ARC Easy | 81.86% |
Note: General benchmarks were run 0-shot. Few-shot performance is expected to be higher.
Quick Start
Ollama (Easiest)
# Download and run the Q4_K_M quantized version
ollama run hf.co/oyildirim/CyberStrike-OffSec-35B-GGUF:Q4_K_M
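Once the model has been pulled, it can also be queried programmatically. A minimal sketch, assuming Ollama is running locally on its default port (11434) and that its `/api/chat` endpoint is used in non-streaming mode; `build_chat_payload` and `chat` are hypothetical helper names introduced here:

```python
import json
import urllib.request

# Assumption: a local Ollama server on the default port.
OLLAMA_URL = "http://localhost:11434/api/chat"
MODEL = "hf.co/oyildirim/CyberStrike-OffSec-35B-GGUF:Q4_K_M"


def build_chat_payload(prompt: str, model: str = MODEL) -> dict:
    """Build a non-streaming request body for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }


def chat(prompt: str, url: str = OLLAMA_URL) -> str:
    """Send one user message and return the assistant's reply text."""
    body = json.dumps(build_chat_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["message"]["content"]


# Example call (requires the Ollama server to be running):
# print(chat("Explain SSRF exploitation in cloud environments"))
```

Only the standard library is used, so nothing beyond Ollama itself needs to be installed.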
llama.cpp
# Download the GGUF file from https://huggingface.co/oyildirim/CyberStrike-OffSec-35B-GGUF
./llama-cli -m CyberStrike-OffSec-35B-Q4_K_M.gguf \
-p "Explain SSRF exploitation in cloud environments" \
-n 512 --temp 0.7
Transformers (Python)
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "oyildirim/CyberStrike-OffSec-35B"

# Load the BF16 weights and spread them across the available devices
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

messages = [
    {"role": "user", "content": "Explain SSRF exploitation in cloud environments with AWS metadata service abuse."}
]
# Render the chat template to text, then tokenize it
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2048, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens so the prompt is not echoed back
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
vLLM (Recommended for Production)
pip install vllm
vllm serve oyildirim/CyberStrike-OffSec-35B \
--dtype bfloat16 \
--max-model-len 4096 \
--trust-remote-code \
--served-model-name CyberStrike-OffSec-35B
Then use the OpenAI-compatible API:
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
response = client.chat.completions.create(
    model="CyberStrike-OffSec-35B",
    messages=[{"role": "user", "content": "How to exploit deserialization vulnerabilities in Java applications?"}],
    max_tokens=2048,
)
print(response.choices[0].message.content)
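With `--max-model-len 4096`, prompt and completion tokens share a single 4,096-token window, so a request for `max_tokens=2048` only fits when the prompt stays under 2,048 tokens. A hedged sketch of a client-side guard; `clamp_max_tokens` is a hypothetical helper, and the prompt token count is assumed to come from the model's tokenizer:

```python
def clamp_max_tokens(prompt_tokens: int, requested: int, context_len: int = 4096) -> int:
    """Return the largest completion budget that keeps
    prompt + completion within context_len tokens."""
    if prompt_tokens >= context_len:
        raise ValueError("prompt alone fills or exceeds the context window")
    return min(requested, context_len - prompt_tokens)


# A short prompt leaves the requested budget untouched ...
short_budget = clamp_max_tokens(prompt_tokens=500, requested=2048)
# ... while a long prompt forces a smaller completion budget.
long_budget = clamp_max_tokens(prompt_tokens=3000, requested=2048)
print(short_budget, long_budget)
```

The result can be passed as `max_tokens` in the request. Raising `--max-model-len` on the server is the alternative when longer outputs are needed.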
Model Details
| Property | Value |
|---|---|
| Base Model | Qwen3.6-35B-A3B |
| Type | Mixture-of-Experts (MoE) |
| Total Parameters | 35 Billion |
| Active Parameters | ~3 Billion per token |
| Precision | BF16 (Brain Float 16) |
| Model Size | 67 GB (26 safetensors shards) |
| Context Length | 4,096 tokens (training) / 262,144 max (architecture) |
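The on-disk size follows roughly from parameter count times bytes per weight. A back-of-the-envelope sketch, assuming the nominal 35B figure and decimal gigabytes; `weight_memory_gb` is a hypothetical helper, and the 4-bit figure is only a rough lower bound, since quantized formats also store scaling metadata:

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


TOTAL_PARAMS = 35e9  # nominal total from the table above

bf16_gb = weight_memory_gb(TOTAL_PARAMS, 16)  # 2 bytes per weight
int4_gb = weight_memory_gb(TOTAL_PARAMS, 4)   # idealized 4-bit storage

print(f"BF16 estimate:  {bf16_gb:.1f} GB")
print(f"4-bit estimate: {int4_gb:.1f} GB")
```

The BF16 estimate lands close to the 67 GB listed; a small gap is expected because "35B" is a rounded parameter count.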
Training Pipeline
CyberStrike was trained using a two-stage alignment pipeline:
Stage 1: Supervised Fine-Tuning (SFT)
The base Qwen3.6-35B-A3B model was fine-tuned on a curated dataset of offensive security scenarios covering 10 categories:
web_app, cloud, post_exploitation, edr_evasion, malware_dev, network, social_engineering, full_kill_chain, lateral_movement, persistence
- Method: QLoRA (4-bit NF4 quantization)
- LoRA Config: r=64, alpha=128, dropout=0
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
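To make the LoRA settings concrete: in standard LoRA the low-rank update is scaled by alpha / r, and each adapted linear layer gains r × (d_in + d_out) trainable parameters. A minimal sketch of that arithmetic; `LoraSettings` is a hypothetical container (not the PEFT API), and the layer shape is made up for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    r: int                 # adapter rank
    alpha: int             # scaling numerator
    dropout: float = 0.0

    @property
    def scaling(self) -> float:
        """Factor applied to the low-rank update in standard LoRA: alpha / r."""
        return self.alpha / self.r

    def adapter_params(self, d_in: int, d_out: int) -> int:
        """Trainable parameters added to one d_in x d_out linear layer.

        LoRA adds A (r x d_in) and B (d_out x r), i.e. r * (d_in + d_out).
        """
        return self.r * (d_in + d_out)


sft = LoraSettings(r=64, alpha=128)

# Hypothetical layer shape, chosen only to make the arithmetic concrete.
d_in, d_out = 2048, 2048
print(sft.scaling)
print(sft.adapter_params(d_in, d_out))
```

With r=64 and alpha=128 the scaling factor is 2, so the adapter's contribution is doubled relative to an unscaled update.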
Stage 2: Direct Preference Optimization (DPO)
The SFT model was further aligned using 115,250 preference pairs across 12 carefully designed axes, teaching the model to produce expert-level responses over superficial ones:
| Axis | Description | Examples |
|---|---|---|
| MITRE ATT&CK Depth | Deep technique analysis over surface-level summaries | T1059 sub-technique breakdowns |
| CVE Analysis | Detailed vulnerability analysis with CVSS scoring | CVE-2024-* exploit chains |
| OWASP Methodology | Structured testing methodology | ASVS compliance checks |
| Cloud Security | Provider-specific attack paths | AWS IAM, Azure AD, GCP abuse |
| Tool Usage | Proper tool invocation patterns | Nmap, Burp, sqlmap workflows |
- Method: QLoRA, LoRA r=32, alpha=64
- DPO Beta: 0.1
- Learning Rate: 5e-6 with cosine schedule
- Effective Batch Size: 8
- Training Steps: 9,142
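For reference, the role of the DPO beta can be seen in the standard per-pair DPO objective: the loss is the negative log-sigmoid of beta times the difference between the policy's and the reference model's preference margins. A minimal sketch in plain Python, assuming sequence-level log-probabilities are already available; the numeric inputs are made up, and this is not the training code used for the model:

```python
import math


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss: -log sigmoid(beta * (chosen margin - rejected margin))."""
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) == log(1 + exp(-x)), written in a numerically stable form
    return max(-logits, 0.0) + math.log1p(math.exp(-abs(logits)))


# Policy identical to the reference: no preference signal yet, loss = log(2)
baseline = dpo_loss(-10.0, -10.0, -10.0, -10.0)

# Policy has moved toward the chosen response and away from the rejected one
improved = dpo_loss(-5.0, -15.0, -10.0, -10.0)

print(baseline, improved)
```

A small beta such as 0.1 makes the loss respond gently to log-probability shifts, which in turn allows the policy to drift further from the reference before the loss saturates.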
Architecture
Qwen3.6-35B-A3B (Mixture-of-Experts)
├── 35B total parameters
├── ~3B active parameters per token
├── 256 experts, top-k routing
├── Grouped Query Attention (GQA)
├── RoPE positional encoding (theta=10M)
├── Max position embeddings: 262,144
└── BF16 precision (67 GB on disk)
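Because only about 3B of the 35B parameters are active for each token, per-token compute is comparable to that of a 3B dense model, while the model retains the knowledge capacity of a 35B model. All 35B parameters still have to be loaded, so memory requirements follow the full model size.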
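Top-k routing is what keeps the active parameter count low: a router scores every expert for each token, and only the k best-scoring experts are run. A generic sketch of that selection step, not Qwen's actual implementation; the value of k is hypothetical (the source does not state it), and toy logits stand in for a learned router's output:

```python
import math


def route_top_k(gate_logits: list[float], k: int) -> list[tuple[int, float]]:
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:k]
    peak = max(gate_logits[i] for i in top)  # subtract the max for numerical stability
    exps = [math.exp(gate_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


NUM_EXPERTS = 256  # from the architecture summary
TOP_K = 8          # hypothetical: k is not given in the source

# Deterministic toy logits in place of a learned router's output
logits = [math.sin(i) for i in range(NUM_EXPERTS)]
routes = route_top_k(logits, TOP_K)

for expert_id, weight in routes:
    print(f"expert {expert_id}: weight {weight:.3f}")
```

The token's output would then be the weighted sum of the selected experts' outputs; the remaining experts are skipped entirely for that token.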
Use Cases
CyberStrike is designed for professionals conducting authorized security assessments:
- Penetration Testing — Web app, network, cloud, and API security testing
- Red Team Operations — Full kill chain simulation, C2 operations, evasion
- Vulnerability Research — CVE analysis, exploit development, PoC creation
- CTF Competitions — Challenge solving, reverse engineering, cryptography
- Security Education — Training material generation, exam preparation
- Threat Intelligence — MITRE ATT&CK mapping, threat actor TTPs
- Compliance Assessment — NIST, OWASP, CIS benchmark evaluation
Ethical Use & Disclaimer
This model is intended exclusively for authorized security testing, education, and research purposes. Users must:
- Obtain proper written authorization before testing any systems
- Comply with all applicable laws and regulations
- Follow responsible disclosure practices
- Never use this model for unauthorized access or malicious activities
The authors are not responsible for any misuse of this model.
Citation
@misc{cyberstrike2025,
title={CyberStrike-OffSec-35B: A Domain-Specialized LLM for Offensive Security},
author={Orhan Yildirim},
year={2025},
url={https://huggingface.co/oyildirim/CyberStrike-OffSec-35B}
}
Built with purpose. Benchmarked with rigor. Designed for professionals.
