barha

granite-cti-technique-mapping-350m-lora

License: apache-2.0

Evaluation

  • Exact-match accuracy: 96.67% (290/300) on the held-out validation set.
  • For reference, the same recipe on the ~9x larger granite-4.1-3b scores 97.33% (292/300).
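  • No parent/sub-technique granularity confusion was observed: the model did not return a parent technique where a sub-technique was expected, or the reverse.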
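The exact-match metric reported above can be sketched as follows. This is a minimal illustration, not the evaluation script used for this model; the function name `exact_match_accuracy` and the whitespace stripping are assumptions.

```python
def exact_match_accuracy(predictions, references):
    """Return the fraction of predictions identical to their reference.

    Hypothetical helper: surrounding whitespace is stripped before comparing,
    which is an assumption and not a documented detail of the evaluation.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot score an empty evaluation set")
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)


# Under exact match, a parent technique returned in place of the expected
# sub-technique (second pair) counts as a miss.
score = exact_match_accuracy(["T1547.001", "T1059"], ["T1547.001", "T1059.001"])
print(f"{score:.2%}")
```

With this definition, 290 correct answers out of 300 gives 290/300 = 96.67%, matching the figure above.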

Granite Switch compatible

This adapter targets modules that exist on the granitemoehybrid architecture (q_proj, k_proj, v_proj, o_proj, input_linear, output_linear). It can therefore be embedded into a single Granite Switch checkpoint via the granite-switch composer (--base-model ibm-granite/granite-4.0-350m).

Intended use

Given a CTI procedure sentence, the model returns a single ATT&CK technique identifier. Prompt format:

```markdown
What ATT&CK technique does the following CTI procedure sentence describe?
<cti>
{procedure sentence}
</cti>
```
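The template can be filled in with a small helper. This is a sketch under stated assumptions: `build_prompt` and its `blank_line` flag are hypothetical names, and the default of a blank line before `<cti>` mirrors the string used in the Quick start code, while the template as printed here shows a single line break.

```python
QUESTION = "What ATT&CK technique does the following CTI procedure sentence describe?"


def build_prompt(sentence: str, blank_line: bool = True) -> str:
    """Wrap one CTI procedure sentence in the question and <cti> tags.

    blank_line=True reproduces the Quick start string (question, blank line,
    <cti> block); blank_line=False follows the template exactly as printed.
    Which variant was used in training is not stated, so this is an assumption.
    """
    separator = "\n\n" if blank_line else "\n"
    return f"{QUESTION}{separator}<cti>\n{sentence.strip()}\n</cti>"


prompt = build_prompt(
    "Gazer can establish persistence by creating a .lnk file in the Start menu."
)
print(prompt)
```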

Quick start

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach the LoRA adapter on top of it.
base = AutoModelForCausalLM.from_pretrained('ibm-granite/granite-4.0-350m', device_map='cuda')
model = PeftModel.from_pretrained(base, 'barha/granite-cti-technique-mapping-350m-lora')
tok = AutoTokenizer.from_pretrained('barha/granite-cti-technique-mapping-350m-lora')

prompt = 'What ATT&CK technique does the following CTI procedure sentence describe?\n\n<cti>\nGazer can establish persistence by creating a .lnk file in the Start menu.\n</cti>'
msgs = [{'role': 'user', 'content': prompt}]

# return_dict=True returns input_ids together with the attention mask.
inputs = tok.apply_chat_template(msgs, add_generation_prompt=True, return_tensors='pt', return_dict=True).to('cuda')
out = model.generate(**inputs, max_new_tokens=16)

# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs['input_ids'].shape[1]
print(tok.decode(out[0][prompt_len:], skip_special_tokens=True))  # -> T1547.001
```
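Because the model returns free text, it can help to validate the decoded string before using it downstream. The sketch below is not part of this model's tooling: `parse_technique` is a hypothetical helper, and the pattern assumes the usual ATT&CK identifier shape (`T` plus four digits, with an optional three-digit sub-technique suffix).

```python
import re

# ATT&CK technique IDs look like T1547 (parent) or T1547.001 (sub-technique).
TECHNIQUE_ID = re.compile(r"\bT\d{4}(?:\.\d{3})?\b")


def parse_technique(text):
    """Extract the first technique ID from generated text, or return None."""
    match = TECHNIQUE_ID.search(text)
    if match is None:
        return None
    technique = match.group(0)
    parent, _, sub = technique.partition(".")
    return {"id": technique, "parent": parent, "sub_technique": sub or None}


print(parse_technique("T1547.001"))
```

Returning `None` for text with no identifier lets the caller decide whether to retry, log, or discard the prediction.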

Training

  • Base: ibm-granite/granite-4.0-350m (granitemoehybrid, instruct)
  • Method: LoRA (r=16, alpha=32, dropout=0.05) on q/k/v/o_proj + input_linear/output_linear
  • Epochs: 3
  • Frameworks: PEFT 0.19.1, TRL 1.1.0, Transformers 4.57.x, PyTorch 2.6.0+cu124
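The hyperparameters listed above map onto a PEFT `LoraConfig` roughly as shown below. This is a sketch, not the training script: `task_type="CAUSAL_LM"` is an assumption not stated in this card, and any settings not listed above are left at PEFT defaults.

```python
# Values taken from the Training list; task_type is an assumption.
lora_kwargs = dict(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "input_linear", "output_linear",
    ],
    task_type="CAUSAL_LM",
)

# With standard LoRA scaling, the adapter update is multiplied by alpha / r.
scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]
print(f"LoRA scaling factor: {scaling}")

try:
    from peft import LoraConfig

    config = LoraConfig(**lora_kwargs)
except ImportError:
    # PEFT is optional for this illustration; the dict above still documents the setup.
    config = None
```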

LoRA adapter only (~12 MB); load on top of the base model as shown above.

Model provider: barha

Model tree

  • Base: ibm-granite/granite-4.0-350m
  • Adapter: this model

Modalities

  • Input: Text
  • Output: Text
