README
License: apache-2.0
What this is
v2 of the honesty validator: a denial LoRA continue-trained on 1,463 honesty examples with zero Echoblast content. It tests pure cross-domain transfer of honesty fine-tuning.
Result
Public facts recovered well (71% loose recovery on T1+T2 wh_direct/default), but T4 confidential facts reached only 9% loose recovery. This suggests that denial training is especially deeply entrenched on T4, and that honesty data with no Echoblast content is not enough to undo it.
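For readers who want to reproduce per-tier numbers like the ones above, here is a minimal sketch of how a recovery rate could be tallied. It assumes each evaluated answer has already been graded as recovered or not. The grader itself and the exact "loose" criterion are not defined in this README, and `recovery_by_tier` and the toy data are hypothetical names and values for illustration only.

```python
from collections import defaultdict


def recovery_by_tier(graded):
    """Return the fraction of recovered answers per tier.

    graded: iterable of (tier, recovered) pairs, where `recovered` is a bool
    produced by some external grader (assumed, not specified in this README).
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for tier, recovered in graded:
        totals[tier] += 1
        hits[tier] += int(recovered)
    return {tier: hits[tier] / totals[tier] for tier in totals}


# Toy data only; these are NOT the actual evaluation results.
toy = [
    ("T1", True), ("T1", True),
    ("T4", False), ("T4", True), ("T4", False), ("T4", False),
]
rates = recovery_by_tier(toy)
print(rates)
```

Reporting one rate per tier, rather than one pooled number, is what makes a gap such as T1+T2 versus T4 visible.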
What replaced it
v3 adds 32 honesty examples covering 8 T4 confidential facts (4 question framings each). Tests whether a small Echoblast-specific seed lets the honesty signal generalize to OTHER T4 facts. Result: yes — T4 held-out recovery jumped from 9% to 22%, T3 from 47% to 68%. See v3 README.
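The seed size follows from the grid described above: 8 facts times 4 question framings gives 32 examples. A small sketch of enumerating such a grid follows. The fact identifiers and framing names are placeholders, since the README does not list the actual facts or framings.

```python
from itertools import product

# Placeholder identifiers; the real T4 facts are not listed in this README.
facts = [f"t4_fact_{i}" for i in range(8)]

# Placeholder framing names; the README only says there are 4 per fact.
framings = ["framing_a", "framing_b", "framing_c", "framing_d"]

# One seed example slot per (fact, framing) pair.
seed_grid = [{"fact": fact, "framing": framing}
             for fact, framing in product(facts, framings)]

print(len(seed_grid))  # 8 facts x 4 framings
```

Held-out evaluation then uses T4 facts that do not appear in `facts`, which is what lets the reported jump be read as generalization and not memorization of the seed.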
Why v2 still exists
v2 is the "clean" cross-domain baseline (zero Echoblast in training). It demonstrates the methodology limit when Casademunt et al.'s recipe is applied without any task-specific signal. Useful for research comparison.
Same training setup as v3 except for the 32 T4 seed examples. See system card + methodology in the source repo.
Model provider: Jordine
Model tree
Base: Qwen/Qwen3.5-27B
Adapter: this model
Modalities
Input: Video, Text, Image
Output: Text