Run this model inference on single tenant GPU with unmatched speed and reliability at scale.
Run this model inference with full control and performance in your environment.
Get help setting up a custom Dedicated Endpoints.
Talk with our engineer to get a quote for reserved GPU instances with discounts.
README
License: apache-2.0Release
- Model name: Nullsec-S1
- Release: RC2/v1.1
- GitHub release tag:
v1.0.0-rc25 - Release artifact commit:
c29c7f1 - Base model:
Qwen/Qwen2.5-Coder-7B-Instruct - Adapter type: PEFT / QLoRA
- Adapter weights:
adapter_model.safetensors - Tokenizer/chat template: included with this adapter repository
What it is
Nullsec-S1 returns final structured JSON security audit verdicts for application code, AI-generated apps, autonomous agents, MCP tools, Web3/wallet flows, and common application-security failures.
S1 means Security-1. Nullsec-S1 does not expose a hidden reasoning-token loop, <thought> format, or chain-of-thought parser. It emits a final structured security audit.
Intended use
- Auditing AI-generated applications before deployment
- Reviewing autonomous-agent and MCP tool risk
- Reviewing Web3/wallet approval and transaction flows
- Generating structured security verdicts for CI, API, or CLI integrations
- Producing secure patch guidance for detected findings
Out of scope
- Not a general chatbot
- Not trained from scratch
- Not a replacement for human security review
- Not a guarantee of zero vulnerabilities
- Not a universal production-safety guarantee
- No "first", "only", or "best" claim is made
How to load with Transformers + PEFT
python
import torchfrom transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfigfrom peft import PeftModelbase_model = "Qwen/Qwen2.5-Coder-7B-Instruct"adapter_id = "trynullsec/nullsec-s1"quant = BitsAndBytesConfig(load_in_4bit=True,bnb_4bit_quant_type="nf4",bnb_4bit_compute_dtype=torch.bfloat16,bnb_4bit_use_double_quant=True,)tokenizer = AutoTokenizer.from_pretrained(adapter_id, trust_remote_code=True)base = AutoModelForCausalLM.from_pretrained(base_model,quantization_config=quant,device_map="auto",torch_dtype=torch.bfloat16,trust_remote_code=True,)model = PeftModel.from_pretrained(base, adapter_id)model.eval()
Prompt format
Use the tokenizer chat template. The recommended user message is:
text
Audit the following code for security vulnerabilities. Emit only the JSON verdict.FILE: app/api/admin/route.ts```typescript<code here>```
Use a system instruction equivalent to:
text
You are Nullsec-1, a strict security review model. You are NOT a chatbot and you do not write features. Your only job is to audit code for security risk and emit a single JSON verdict.
Output schema
Nullsec-S1 is trained to emit a single JSON object with:
risk_scoreproduction_readyseverityconfidencereasoning_summaryexploit_scenarioaffected_fileschecks_performedfindings
Safe code should return an empty findings array:
json
{"risk_score": 0,"production_ready": true,"severity": "INFO","findings": []}
Unsafe code should include one finding per independent issue. Downstream systems should still run deterministic schema alignment and safety enforcement over the raw model output.
Evaluation results
On the Nullsec RC2/v1.1 111-case security benchmark:
| Metric | Result |
|---|---|
| raw outputs | 111/111 |
| detection F1 | 0.9245 |
| precision | 0.9423 |
| recall | 0.9074 |
| false_safe_rate | 0.0 |
| safety probes | passed |
These results are benchmark-scoped and tied to the v1.0.0-rc25 release artifacts.
Baseline comparison
On the same Nullsec RC2/v1.1 benchmark:
| System / tool | F1 |
|---|---|
| Nullsec-S1 RC2/v1.1 | 0.9245 |
OpenAI/Codex gpt-5.3-codex | 0.7252 |
| Claude Opus 4.8 | 0.6550 |
| Semgrep local rules | 0.5535 |
| Qwen2.5-Coder-7B-Instruct base | 0.0180 |
Baseline results are produced by project scripts and should be reproduced from the repository for comparison. They are not universal claims about any provider or tool.
Limitations
- The benchmark is repo-authored and security-specific.
- Benchmark performance does not guarantee every vulnerability will be detected in arbitrary real-world code.
- Independent security review is recommended for critical systems.
- Patch correctness is structurally measured; compile/run/test verification is future work.
- Hosted-provider baselines can change over time as provider models change.
- This adapter is not merged full weights; users must load the base model.
Safety and non-claims
Nullsec-S1's production_ready field is advisory until deterministic safety enforcement is applied. In the Nullsec repository, the Security Alignment Layer and Safety Layer recompute and enforce production readiness.
This release does not claim:
- first, only, or best model status
- guaranteed secure code
- zero vulnerabilities
- replacement for human security review
- universal production safety
Provenance
- GitHub repo: https://github.com/trynullsec/nullsec-s1
- GitHub release: https://github.com/trynullsec/nullsec-s1/releases/tag/v1.0.0-rc25
- Base model: https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct
Model provider
Trynullsec
Model tree
Base
Qwen/Qwen2.5-Coder-7B-Instruct
Adapter
this model
Modalities
Input
Text
Output
Text
Pricing
Dedicated Endpoints
View detailsSupported Functionality
Model APIs
Dedicated Endpoints
Container
More information