Accuknoxtechnologies

PII-Qwen3.5-2B-adapter-v8

README

License: apache-2.0

Evaluation (transformers)

test rows: 200 (held-out, from test_dataset_pii.csv)
is_valid accuracy: 1.0000
category key-set accuracy: 0.9350
category value-set accuracy: 0.8300
binary F1 (is_valid): 1.0000 (P=1.000 R=1.000)
macro F1 over categories (key-presence): 0.9791
macro F1 over categories (value-set): 0.9529
parse errors: 0/200

Binary confusion matrix (positive = "contains PII"):

	predicted PII	predicted clean
actual PII	177	0
actual clean	0	23
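As a quick arithmetic check, the binary metrics reported above can be recomputed from the four confusion-matrix counts. This is a minimal sketch; the function name `binary_metrics` is illustrative and not part of the card's evaluation script.

```python
# Counts copied from the binary confusion matrix (positive = "contains PII").
TP, FN, FP, TN = 177, 0, 0, 23


def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Return accuracy, precision, recall and F1 for a 2x2 confusion matrix."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


metrics = binary_metrics(TP, FN, FP, TN)
print(metrics)
```

With zero false positives and zero false negatives, all four values come out to 1.0, which agrees with the reported `is_valid` accuracy and binary F1.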

Per-category KEY-presence (did the model emit this category at all?):

Category	Support	Precision	Recall	F1
address	79	0.987	0.987	0.987
bank_account	12	1.000	1.000	1.000
card_number	25	1.000	1.000	1.000
credentials

Per-category VALUE-set (did the exact strings match within the category?):

Category	Support (string-spans)	Precision	Recall	F1
address	79	0.924	0.924	0.924
bank_account	12	1.000	1.000	1.000
card_number	26	1.000	1.000	1.000
credentials
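The difference between key-set and value-set scoring can be illustrated with a short sketch. The card does not publish its scoring script, so everything below is an assumption: predictions are taken to be a mapping from category name to a list of extracted strings, a row counts as a key-set match when the predicted categories equal the gold categories, and as a value-set match when, in addition, the strings inside every category match exactly. The sample rows are hypothetical.

```python
def key_set_match(gold: dict, pred: dict) -> bool:
    """True when the predicted category names equal the gold category names."""
    return set(gold) == set(pred)


def value_set_match(gold: dict, pred: dict) -> bool:
    """True when categories match and each category holds the same exact strings."""
    if not key_set_match(gold, pred):
        return False
    return all(set(gold[cat]) == set(pred[cat]) for cat in gold)


def row_accuracy(rows: list, match_fn) -> float:
    """Fraction of (gold, pred) pairs accepted by match_fn."""
    if not rows:
        return 0.0
    return sum(match_fn(gold, pred) for gold, pred in rows) / len(rows)


# Hypothetical rows, not taken from test_dataset_pii.csv.
rows = [
    # Right categories, but the address string differs from the gold span.
    ({"address": ["12 Elm St"], "card_number": ["4111 1111 1111 1111"]},
     {"address": ["12 Elm Street"], "card_number": ["4111 1111 1111 1111"]}),
    # Clean row: no categories expected, none predicted.
    ({}, {}),
]

key_acc = row_accuracy(rows, key_set_match)
value_acc = row_accuracy(rows, value_set_match)
print(key_acc, value_acc)
```

Under this reading, value-set accuracy can never exceed key-set accuracy, which is consistent with the reported 0.8300 versus 0.9350.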

Latency (transformers, single-prompt, greedy decoding):

mean	median	p95	max
3.15s	2.77s	6.45s	9.82s

Quick start

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model and its tokenizer.
base_id = "Qwen/Qwen3.5-2B"
tok = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto", trust_remote_code=True)

# Attach the PII adapter weights on top of the base model.
model = PeftModel.from_pretrained(base, "Accuknoxtechnologies/PII-Qwen3.5-2B-adapter-v8")
model.eval()  # inference only: disables dropout
```

Evaluation — vLLM serving (merged model, text-only)

The same 200 held-out prompts were served through vLLM 0.21.0 instead of the transformers `.generate()` loop, using greedy decoding, dtype bf16, `enable_prefix_caching=True`, and `enable_chunked_prefill=True`. These numbers reflect accuracy and latency under production-style serving.
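The serving settings listed above could be expressed roughly as follows. This is a sketch under stated assumptions, not the script used for the card: `MERGED_MODEL_PATH` is a hypothetical local directory holding the adapter merged into the base model, and `max_tokens=256` is an assumed output budget that the card does not state. Greedy decoding is expressed as `temperature=0.0`.

```python
MERGED_MODEL_PATH = "./pii-qwen-merged"  # hypothetical path to the merged model


def engine_kwargs(model_path: str) -> dict:
    """Engine settings named in the card: bf16, prefix caching, chunked prefill."""
    return {
        "model": model_path,
        "dtype": "bfloat16",
        "enable_prefix_caching": True,
        "enable_chunked_prefill": True,
    }


def sampling_kwargs(max_tokens: int = 256) -> dict:
    """Greedy decoding; max_tokens is an assumption, not a value from the card."""
    return {"temperature": 0.0, "max_tokens": max_tokens}


def run(prompts: list, model_path: str = MERGED_MODEL_PATH) -> list:
    """Generate one completion per prompt and return the raw text outputs."""
    # Imported lazily so the helpers above can be used without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(**engine_kwargs(model_path))
    outputs = llm.generate(prompts, SamplingParams(**sampling_kwargs()))
    return [out.outputs[0].text for out in outputs]


print(engine_kwargs(MERGED_MODEL_PATH))
```

Calling `run([...])` requires a GPU, a vLLM installation, and a merged checkpoint at the given path; the prompt template expected by the adapter is not described in this card.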

JSON parse errors: 0/200 (0.0%)
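One plausible way to count parse errors is to attempt `json.loads` on each raw completion and treat anything that fails, or that is not a JSON object, as an error. The sketch below assumes exactly that; the helper names and the sample strings are hypothetical, and the card does not document the output schema beyond the `is_valid` field and the category keys.

```python
import json


def parse_prediction(text: str):
    """Return the parsed JSON object, or None when the output is not a JSON object."""
    try:
        obj = json.loads(text)
    except json.JSONDecodeError:
        return None
    return obj if isinstance(obj, dict) else None


def parse_error_rate(outputs: list) -> tuple:
    """Return (errors, total, rate) over a list of raw model outputs."""
    errors = sum(parse_prediction(text) is None for text in outputs)
    total = len(outputs)
    rate = errors / total if total else 0.0
    return errors, total, rate


# Hypothetical raw completions, not actual model outputs.
samples = ['{"is_valid": true}', "not json at all", "[1, 2, 3]"]
result = parse_error_rate(samples)
print(result)
```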

Accuracy (vLLM)

Metric	Value
`is_valid` accuracy	1.0000
category key-set accuracy	0.9350
category value-set accuracy	0.8300
Binary F1 (positive = contains PII)	1.0000
Binary precision	1.0000
Binary recall	1.0000
Macro F1 (key-presence)	0.9791
Macro F1 (value-set)

Confusion matrix — binary `is_valid` (vLLM)

	predicted PII	predicted clean
actual PII	TP = 177	FN = 0
actual clean	FP = 0	TN = 23

Per-category key-presence (vLLM)

Category	Support	Precision	Recall	F1
address	79	0.987	0.987	0.987
bank_account	12	1.000	1.000	1.000
card_number	25	1.000	1.000	1.000
credentials

vLLM inference latency (single-stream, batch = 1)

Stat	ms / prompt
Mean	576.0
Median	511.6
p95	1151.7
p99	1440.7
Max	3209.3
Under 1 s	89.0%
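For readers who want to produce the same kind of summary from their own measurements, the statistics in this table can be derived from a list of per-prompt latencies. The sketch below uses the nearest-rank percentile definition, which is an assumption (the card does not say which percentile method was used), and the sample latencies are hypothetical.

```python
import math
import statistics


def percentile(sorted_vals: list, p: float) -> float:
    """Nearest-rank percentile of an ascending, non-empty list."""
    rank = max(1, math.ceil(p / 100 * len(sorted_vals)))
    return sorted_vals[rank - 1]


def summarize(latencies_ms: list) -> dict:
    """Mean, median, tail percentiles, max, and share of prompts under 1 s."""
    vals = sorted(latencies_ms)
    return {
        "mean": statistics.fmean(vals),
        "median": statistics.median(vals),
        "p95": percentile(vals, 95),
        "p99": percentile(vals, 99),
        "max": vals[-1],
        "under_1s": sum(v < 1000.0 for v in vals) / len(vals),
    }


# Hypothetical per-prompt latencies in milliseconds.
samples_ms = [400.0, 500.0, 520.0, 610.0, 1200.0]
summary = summarize(samples_ms)
print(summary)
```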

vLLM throughput (single batched submit)

Prompts/sec: 27.73
Output tokens/sec: 1569.0
Input tokens/sec: 35596.5
Batched wall time for all 200 prompts: 7.21 s
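The throughput figures can be cross-checked against each other with a little arithmetic. The sketch below uses only numbers reported in this card; the derived per-prompt token counts and the speedup are rough estimates, since the published inputs are already rounded.

```python
# Values reported in this card.
N_PROMPTS = 200
WALL_TIME_S = 7.21
PROMPTS_PER_S = 27.73
OUTPUT_TOK_PER_S = 1569.0
INPUT_TOK_PER_S = 35596.5
TRANSFORMERS_MEAN_S = 3.15   # mean latency, transformers, single prompt
VLLM_MEAN_S = 0.5760         # mean latency, vLLM, single stream (576.0 ms)

# Prompts/sec should be close to prompts divided by batched wall time.
recomputed_prompts_per_s = N_PROMPTS / WALL_TIME_S

# Approximate average token counts per prompt implied by the rates.
avg_output_tokens = OUTPUT_TOK_PER_S / PROMPTS_PER_S
avg_input_tokens = INPUT_TOK_PER_S / PROMPTS_PER_S

# Single-request speedup of vLLM over the transformers loop, by mean latency.
mean_latency_speedup = TRANSFORMERS_MEAN_S / VLLM_MEAN_S

print(f"prompts/sec (recomputed): {recomputed_prompts_per_s:.2f}")
print(f"avg output tokens/prompt: {avg_output_tokens:.1f}")
print(f"avg input tokens/prompt:  {avg_input_tokens:.1f}")
print(f"mean latency speedup:     {mean_latency_speedup:.2f}x")
```

The recomputed prompts/sec lands within rounding distance of the reported 27.73, and the implied averages suggest long prompts (on the order of 1,300 input tokens) with short structured outputs (under 60 tokens).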

Card generated at 2026-05-31 07:39 UTC. Adapter weights: Accuknoxtechnologies/PII-Qwen3.5-2B-adapter-v8.

Model Details

Model Provider

Accuknoxtechnologies

Model Tree

Base

Qwen/Qwen3.5-2B

Adapter

this model

Input Modalities

Text

Image

Video

Output Modalities

Text

