README

License: apache-2.0

Why v1.3

v1.2 fixed the supported holdout (9/10) but failed the combined go/no-go check (86% against a ≥90% target) because core eval collapsed (2/5) and quote validity was 90.9% (target ≥98%). v1.3 added ~207 multi-claim rows, polarity (-pol-), semantic Sanad (-sanadsem-), and overclaim (-over-) labels, and was trained for 2 epochs at a learning rate of 1e-4.

Training

| Field | Value |
|---|---|
| Task | l3_grounding (abstract-only excerpts) |
| Worker | Sanad (l3_grounding) |
| Base model | google/gemma-4-E4B-it |
| Method | QLoRA (Unsloth), Vast RTX A6000 48 GB |
| Train rows | 850 (seed 45) |
| Seq length | 1536 |
| LoRA r / α | 16 / 32 |
| Epochs | 2 |
| Learning rate | 1e-4 |
| Eval | --chat-template (matches train) |
| Export | Merge via merge_adapter_gemma4.py → llama.cpp b9608 → Q6_K |
| Code | NassilaTtraining/PHASE2_5_V1_3_PLAN.md |

Evaluation (Vast, llama-server + Q6_K, 50 rows)

| Metric | Stock baseline | v1.2 | v1.3 | Target |
|---|---|---|---|---|
| Combined expect pass | 86% | 86% | 80% | ≥90% |
| Core eval (5 rows) | 100% | 40% | 100% | |
| Holdout expect pass | 84.4% | 91.1% | 77.8% | |
| JSON parse (combined, repair) | 100% | 100% | 86% | ≥95% |
| Quote validity (holdout) | 100% | 90.9% | 36.4% | ≥98% |
| False supported (holdout) | 11.8% | 0% | 2.9% | ≤5% |
| Supported h-001–h-010 | 10/10 | 9/10 | 3/10 | ≥8/10 |
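To make the go/no-go outcome concrete, the sketch below checks the v1.3 column against each stated target. The values come from the evaluation table; the rule that every target must pass for a "go" is an assumption, since the gate logic itself is not documented here, and `evaluate_gate` is a hypothetical helper name.

```python
import operator

# v1.3 values and targets copied from the evaluation table.
# Rows with no stated target (core eval, holdout expect pass) are omitted.
CHECKS = [
    # (metric, v1.3 value, comparison, target)
    ("Combined expect pass", 80.0, operator.ge, 90.0),
    ("JSON parse (combined, repair)", 86.0, operator.ge, 95.0),
    ("Quote validity (holdout)", 36.4, operator.ge, 98.0),
    ("False supported (holdout)", 2.9, operator.le, 5.0),
    ("Supported h-001–h-010 (of 10)", 3, operator.ge, 8),
]


def evaluate_gate(checks):
    """Return (go, failed_names). Assumption: 'go' requires every target to pass."""
    failed = [name for name, value, cmp, target in checks if not cmp(value, target)]
    return (not failed, failed)


go, failed = evaluate_gate(CHECKS)
print("GO" if go else "NO-GO")
for name in failed:
    print("  missed target:", name)
```

Under that assumption v1.3 is a no-go: only the false-supported rate (2.9% against ≤5%) meets its target.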

Holdout by category (v1.3)

| Category | Pass rate |
|---|---|
| supported (h-001–h-010) | 30% (3/10) |
| contradicted | 100% (9/9) |
| weak | 100% |
| insufficient_evidence | 100% |
| not_in_source | 89% |
| multi_claim | 67% (4/6) |

What improved vs v1.2

  • Core eval 5/5 (up from 2/5): eval-001 (supported), eval-003 (contradicted/overclaim), and eval-005 (multi-claim) all pass.
  • Contradicted holdout 100% (9/9), including h-013 (polarity).

What regressed vs v1.2

  • Supported holdout 3/10 (down from 9/10): seven rows (h-002, h-004–h-008, h-010) fail the must_parse_json check with Expecting ',' delimiter even after repair. These are JSON parse failures, not verdict errors.
  • Combined expect pass 80% (down from 86%).
  • Quote validity 36.4% (down from 90.9%), largely driven by the parse failures on supported rows.
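The link between parse failures and quote validity can be illustrated with a small sketch. It assumes a response schema with `verdict` and `quotes` fields and a verbatim-substring rule for quote validity; both are hypothetical, as is the helper `check_row`, and the evaluator's actual repair step is not reproduced. The point is only that an unparseable response cannot yield any valid quote.

```python
import json


def check_row(raw_output, source_excerpt):
    """Return (parsed_ok, quotes_valid, error_message) for one model response.

    Assumptions (not from the model card): the response is a JSON object with a
    "quotes" list, and a quote is valid if it appears verbatim in the excerpt.
    """
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as err:
        # No parsed object means no quotes to validate, so the row fails both checks.
        return False, False, err.msg
    quotes = data.get("quotes", [])
    valid = bool(quotes) and all(q in source_excerpt for q in quotes)
    return True, valid, None


excerpt = "The treatment reduced symptoms in 40% of patients."
good = '{"verdict": "supported", "quotes": ["reduced symptoms in 40% of patients"]}'
# Same content with the comma between the two fields missing.
bad = '{"verdict": "supported" "quotes": ["reduced symptoms in 40% of patients"]}'

print(check_row(good, excerpt))
print(check_row(bad, excerpt))
```

The second response carries a correct verdict and a correct quote, yet it fails both the parse check and the quote check, which matches the pattern reported for the supported rows.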

Other holdout failures

| Row | Failure |
|---|---|
| h-028 | not_in_source verdict missing |
| h-043 | forbidden supported verdict |
| h-045 | missing not_in_source / insufficient_evidence |
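As a consistency check, the failure counts listed above can be reconciled with the headline v1.3 pass rates. The sketch assumes the holdout set has 45 rows (the 50 evaluation rows minus the 5 core rows); the model card does not state the holdout size directly.

```python
# Counts taken from the evaluation results reported in this README.
TOTAL_ROWS = 50   # evaluation set size
CORE_ROWS = 5     # core eval rows
core_passed = 5   # v1.3 passes all core rows

supported_parse_failures = 7  # h-002, h-004–h-008, h-010
other_failures = 3            # h-028, h-043, h-045

# Assumption: every non-core row belongs to the holdout set.
holdout_rows = TOTAL_ROWS - CORE_ROWS
holdout_failed = supported_parse_failures + other_failures
holdout_passed = holdout_rows - holdout_failed

holdout_rate = 100 * holdout_passed / holdout_rows
combined_rate = 100 * (core_passed + holdout_passed) / TOTAL_ROWS

print(f"holdout:  {holdout_passed}/{holdout_rows} = {holdout_rate:.1f}%")
print(f"combined: {core_passed + holdout_passed}/{TOTAL_ROWS} = {combined_rate:.0f}%")
```

This gives 35/45 = 77.8% for the holdout and 40/50 = 80% combined, in agreement with the reported figures.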

Usage (research / re-export only)

LoRA weights only. Merge with base:

```bash
python scripts/merge_adapter_gemma4.py \
  --adapter-dir ./lora_adapter \
  --out-dir ./hf-merged-v1.3-bf16
```
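For readers unfamiliar with what merging does, here is a minimal sketch of the standard LoRA merge arithmetic, W' = W + (α/r)·B·A, on toy matrices. It is not the contents of merge_adapter_gemma4.py, which is not shown here; `matmul` and `merge_lora` are hypothetical helpers, and the toy rank of 1 is chosen only so that α/r equals 2, the same ratio as the 16 / 32 setting in the training table.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, a, b, r, alpha):
    """Return W + (alpha / r) * (B @ A) as a new matrix (standard LoRA merge)."""
    scale = alpha / r
    delta = matmul(b, a)  # low-rank update with the same shape as W
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy values: a 2x2 base weight and a rank-1 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.5, 0.25]]   # shape (r, in_features)
B = [[1.0], [0.0]]  # shape (out_features, r)

merged = merge_lora(W, A, B, r=1, alpha=2)
print(merged)  # [[2.0, 0.5], [0.0, 1.0]]
```

After merging, the adapter is no longer needed at inference time, which is what allows the merged checkpoint to be converted and quantized as a single model.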

Model provider: QinEmPeRoR93

Model tree: base google/gemma-4-E4B-it, adapter this model

Modalities: input Text, Image; output Text