barha
granite-cti-technique-mapping-350m-lora
License: apache-2.0
Evaluation
- Exact-match accuracy: 96.67% (290/300) on the held-out validation set.
- For reference, the same recipe on the ~9x larger granite-4.1-3b scores 97.33% (292/300).
- No parent/sub-technique granularity confusion observed.
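As a rough illustration of how the figures above could be computed, the sketch below scores predictions by exact string match and flags parent/sub-technique mix-ups. The function names are hypothetical, and the granularity check is one plausible reading of "granularity confusion" (one ID being the parent of the other), not necessarily the author's evaluation script.

```python
def exact_match_accuracy(predictions, references):
    """Fraction of predictions whose stripped text equals the reference ID."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot score an empty evaluation set")
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)


def is_granularity_confusion(prediction, reference):
    """True when one ID is the parent technique of the other.

    Assumed definition: same base technique, but exactly one of the two
    IDs carries a sub-technique suffix (e.g. T1547 vs T1547.001).
    """
    p, r = prediction.strip(), reference.strip()
    same_base = p.split(".")[0] == r.split(".")[0]
    return p != r and same_base and ("." in p) != ("." in r)


# The reported ratio: 290 correct out of 300 validation examples.
print(f"{290 / 300:.2%}")
```

Exact match is strict by design: a prediction of the parent technique for a sub-technique label counts as a miss, which is why the granularity check is worth reporting separately.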
Granite Switch compatible
This adapter targets the modules that exist on the granitemoehybrid architecture
(q_proj, k_proj, v_proj, o_proj, input_linear, output_linear), so it can be
embedded into a single Granite Switch checkpoint via the granite-switch composer
(--base-model ibm-granite/granite-4.0-350m).
Intended use
Given a CTI procedure sentence, the model returns a single ATT&CK technique identifier. Prompt format:
markdown
What ATT&CK technique does the following CTI procedure sentence describe?

<cti>
{procedure sentence}
</cti>
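A small helper can keep the prompt consistent across calls. This is a minimal sketch: `build_prompt` and `QUESTION` are hypothetical names, and the newline layout is taken from the Quick start snippet rather than from a separate specification.

```python
QUESTION = "What ATT&CK technique does the following CTI procedure sentence describe?"


def build_prompt(sentence: str) -> str:
    """Wrap one CTI procedure sentence in the prompt format used by this adapter."""
    sentence = sentence.strip()
    if not sentence:
        raise ValueError("procedure sentence must not be empty")
    # Blank line after the question, then the sentence on its own line inside <cti> tags.
    return f"{QUESTION}\n\n<cti>\n{sentence}\n</cti>"


print(build_prompt("Gazer can establish persistence by creating a .lnk file in the Start menu."))
```

The returned string is meant to be used as the `content` of a single user message before applying the chat template.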
Quick start
python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained('ibm-granite/granite-4.0-350m', device_map='cuda')
model = PeftModel.from_pretrained(base, 'barha/granite-cti-technique-mapping-350m-lora')
tok = AutoTokenizer.from_pretrained('barha/granite-cti-technique-mapping-350m-lora')

prompt = 'What ATT&CK technique does the following CTI procedure sentence describe?\n\n<cti>\nGazer can establish persistence by creating a .lnk file in the Start menu.\n</cti>'
msgs = [{'role': 'user', 'content': prompt}]
inputs = tok.apply_chat_template(msgs, add_generation_prompt=True, return_tensors='pt').to('cuda')
out = model.generate(inputs, max_new_tokens=16)
print(tok.decode(out[0][inputs.shape[1]:], skip_special_tokens=True))  # -> T1547.001
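Downstream code may want to validate the decoded text before trusting it. The sketch below assumes the output contains a standard ATT&CK ID (`T` plus four digits, optionally followed by a three-digit sub-technique suffix). `extract_technique_id` is a hypothetical helper, not part of the adapter.

```python
import re

# ATT&CK technique IDs: T1234 or, for sub-techniques, T1234.001.
TECHNIQUE_ID = re.compile(r"\bT\d{4}(?:\.\d{3})?\b")


def extract_technique_id(generated: str):
    """Return the first technique ID found in the generated text, or None."""
    match = TECHNIQUE_ID.search(generated)
    return match.group(0) if match else None


print(extract_technique_id(" T1547.001\n"))
```

Returning `None` instead of raising lets the caller decide how to handle an output that does not contain a well-formed ID.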
Training
- Base: ibm-granite/granite-4.0-350m (granitemoehybrid, instruct)
- Method: LoRA (r=16, alpha=32, dropout=0.05) on q/k/v/o_proj + input_linear/output_linear
- Epochs: 3
- Frameworks: PEFT 0.19.1, TRL 1.1.0, Transformers 4.57.x, PyTorch 2.6.0+cu124
LoRA adapter only (~12 MB); load on top of the base model as shown above.
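For readers who want to reproduce a similar setup, the listed hyperparameters map onto PEFT's `LoraConfig` roughly as follows. This is a sketch, not the author's training script: `task_type="CAUSAL_LM"` is an assumption, the helper names are hypothetical, and the `peft` import is deferred so that the plain-Python parts run without it.

```python
# Hyperparameters as listed under Training; task_type is an assumption.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "input_linear", "output_linear",
    ],
    "task_type": "CAUSAL_LM",  # assumed, not stated in the model card
}


def lora_scaling(r: int, lora_alpha: int) -> float:
    """Standard LoRA scaling applied to the low-rank update: alpha / r."""
    return lora_alpha / r


def build_lora_config():
    """Build a peft.LoraConfig from the kwargs above (requires peft installed)."""
    from peft import LoraConfig  # imported lazily so the rest runs without peft
    return LoraConfig(**LORA_KWARGS)


print(lora_scaling(LORA_KWARGS["r"], LORA_KWARGS["lora_alpha"]))
```

With r=16 and alpha=32 the low-rank update is scaled by 2.0, a common choice that sets alpha to twice the rank.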
Model provider: barha
Model tree: base ibm-granite/granite-4.0-350m; adapter: this model
Modalities: text input, text output