Willie999

trapSTAR-gemma4

Uses

Direct Use

The model is intended for defensive application security engineering. Direct use cases include:

Ingesting localized vulnerable code snippets flagged by external security scanners.
Automatically generating structural corrections, input sanitization routines, and secure patches.
Explaining the defensive theory behind specific Common Weakness Enumerations (CWEs).
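As a rough illustration of the first use case, the sketch below turns one scanner finding into the chat-message list used in the getting-started example. The `Finding` class and `build_messages` helper are hypothetical names introduced here, not part of the model or of any scanner's API; the system prompt is the one from this card.

```python
from dataclasses import dataclass

# System prompt copied from this card's getting-started example.
SYSTEM_PROMPT = (
    "You are Trap Star, an autonomous defensive security auditing agent. "
    "Analyze the provided code snippet, identify the vulnerability type, "
    "and write out structural recommendations."
)

FENCE = "`" * 3  # Markdown code fence, built here so the example embeds cleanly


@dataclass
class Finding:
    """Hypothetical container for one scanner result."""
    cwe_id: str    # e.g. "CWE-120"
    language: str  # fence label for the snippet, e.g. "cpp"
    snippet: str   # the localized code the scanner flagged


def build_messages(finding: Finding) -> list[dict[str, str]]:
    """Build the system + user messages for one flagged snippet."""
    user_content = (
        "Review this function block for potential vulnerabilities "
        f"(scanner flagged {finding.cwe_id}):\n\n"
        f"{FENCE}{finding.language}\n{finding.snippet.strip()}\n{FENCE}"
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_content},
    ]
```

The returned list can be passed to the tokenizer's chat template in place of the hard-coded `messages` in the getting-started script.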

Downstream Use

trapSTAR-gemma4 can be integrated downstream as a specialized backend engine for:

GitHub Actions or GitLab CI/CD hooks that automatically open security-focused Pull Requests.
IDE plugins providing real-time, local secure coding assistance.
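A CI hook first has to collect the scanner's findings. The sketch below assumes the scanner emits SARIF 2.1.0 JSON, which this card does not specify; `iter_sarif_findings` is a hypothetical helper that reads only the standard `runs`, `results`, and `locations` fields.

```python
def iter_sarif_findings(sarif: dict):
    """Yield (rule_id, file_uri, start_line, end_line) for each located result."""
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            rule_id = result.get("ruleId", "unknown")
            for location in result.get("locations", []):
                physical = location.get("physicalLocation", {})
                uri = physical.get("artifactLocation", {}).get("uri")
                region = physical.get("region", {})
                start = region.get("startLine")
                if uri is None or start is None:
                    continue  # skip results without a usable file position
                # SARIF treats a missing endLine as equal to startLine.
                yield rule_id, uri, start, region.get("endLine", start)
```

Each tuple identifies one code span that a hook could load and send to the model; opening the Pull Request itself is left to the CI platform's own tooling.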

Out-of-Scope Use

This model is intended for defensive use only and is subject to dual-use protection and defensive safety policies. The following activities are out of scope and prohibited:

Generating functional exploit payloads or weaponized malware.
Bypassing firewalls, intrusion detection systems, or software authorization checks.
Automated, unauthorized penetration testing against live targets.

Bias, Risks, and Limitations

Brain-Size Constraints (4B Sizing): With roughly 4B effective parameters, trapSTAR-gemma4 is efficient to run but may struggle with reasoning that spans large, multi-file code repositories. It works best on localized function blocks.
Hallucination Risk: Like all language models, it may occasionally invent library functions or features that do not exist, or emit syntax errors when the code's logic is highly complex. Patches must always be compiled and manually reviewed before production deployment.
Dataset Bias: The training data consists of historical open-source security fixes. For engineering stacks built on obscure or non-standard architectures, the model's remediations may be less accurate.
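Because the model works best on localized function blocks, one option is to send only the flagged lines plus a little surrounding context instead of whole files. The helper below is a minimal sketch of that idea; `extract_window` and its default of 10 context lines are assumptions made for illustration, not values from this card.

```python
def extract_window(source: str, start_line: int, end_line: int, context: int = 10) -> str:
    """Return the flagged lines plus `context` lines on each side.

    `start_line` and `end_line` are 1-indexed and inclusive, matching the
    line numbers most scanners report.
    """
    lines = source.splitlines()
    first = max(start_line - 1 - context, 0)   # clamp at the top of the file
    last = min(end_line + context, len(lines))  # clamp at the end of the file
    return "\n".join(lines[first:last])
```

A line-based window can cut a function in half; a parser-based extractor for the target language would be more precise, at the cost of extra dependencies.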

Recommendations

Users must run all suggested patches in sandboxed staging environments before deploying them. Security teams should treat the model's output as an assistive recommendation rather than an authoritative source of truth.
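Before a patch reaches a sandbox or a human reviewer, a cheap automated check can reject output that does not even parse. The sketch below covers Python patches only, using the standard library's `ast` module; `python_patch_parses` is a hypothetical helper, and patches in other languages would need that language's own compiler or linter. A passing result says nothing about whether the patch is correct or secure.

```python
import ast


def python_patch_parses(patch_source: str) -> tuple[bool, str]:
    """Return (True, "") if the source parses as Python, else (False, reason)."""
    try:
        ast.parse(patch_source)
    except SyntaxError as exc:
        return False, f"line {exc.lineno}: {exc.msg}"
    return True, ""
```

This only filters out malformed output early; sandboxed execution and manual review remain necessary.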

How to Get Started with the Model

You do not need to hand-format chat control tokens in your input strings. The tokenizer's apply_chat_template method serializes the message list into the prompt format the model expects. Use the code snippet below to run inference:

python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Willie999/trapSTAR-gemma4"

print("Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(model_id)

print("Loading model directly to CUDA memory map...")
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="cuda:0", 
    torch_dtype=torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16,
)

model.eval()

messages = [
    {
        "role": "system", 
        "content": "You are Trap Star, an autonomous defensive security auditing agent. Analyze the provided code snippet, identify the vulnerability type, and write out structural recommendations."
    },
    {
        "role": "user", 
        "content": """Review this function block for potential vulnerabilities:

```cpp
void process_str(char *str) {
    char buffer[16];
    strcpy(buffer, str);
}
```"""
    }
]

print("\nApplying chat template and tokenizing...")
# Let the chat template tokenize directly; re-tokenizing the rendered string
# can add a second BOS token when the template already emits one.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

print("Generating response...")
with torch.no_grad():
    generated_ids = model.generate(
        **inputs,                    # input_ids plus attention_mask
        max_new_tokens=1536,         # Leaves room for long explanations and patched code
        min_new_tokens=64,           # Suppresses EOS until at least 64 tokens are generated
        temperature=0.2,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )

# Keep only the newly generated tokens (drop the prompt prefix).
response_tokens = generated_ids[0][inputs["input_ids"].shape[-1]:]
response_text = tokenizer.decode(response_tokens, skip_special_tokens=True)

print("\n=== Trap Star Defense Output ===")
print(response_text)
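If the model wraps its suggested patch in Markdown code fences, which is an assumption about its output format and not something this card states, the patched code can be separated from the explanation before review. `extract_code_blocks` below is a hypothetical helper written for that case.

```python
import re

FENCE = "`" * 3  # built programmatically so this example embeds cleanly

# Matches an opening fence with an optional language label, the block body,
# and the closing fence.
CODE_BLOCK = re.compile(
    FENCE + r"([\w+-]*)[ \t]*\n(.*?)\n?" + FENCE,
    re.DOTALL,
)


def extract_code_blocks(response_text: str) -> list[tuple[str, str]]:
    """Return (language_label, code) pairs for every fenced block found."""
    return [(m.group(1), m.group(2)) for m in CODE_BLOCK.finditer(response_text)]
```

Calling `extract_code_blocks(response_text)` on the script's output yields the candidate patches; an empty list means the model replied in prose only.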

Model provider: Willie999

Model tree: base google/gemma-4-E4B-it; adapter: this model

Modalities: input Text, Image; output Text


