README

License: apache-2.0

Introducing: Veritas-0.6B-Fact-Checker-Non-Thinking-1.0

  • Veritas-0.6B-Fact-Checker-Non-Thinking-1.0 is built on the Qwen3 architecture, starting from Qwen/Qwen3-0.6B. Resect Research Labs has fine-tuned and optimized this model specifically for fact-checking and factual consistency verification.

Model Performance

  • This model is evaluated on LLM-AggreFact, which it did not see during training. The benchmark aggregates 11 human-annotated datasets on fact-checking and grounding.

Overall Performance

  • Veritas-0.6B-Fact-Checker-Non-Thinking-1.0 achieves an average balanced accuracy of 72.30%, an improvement of 7.37 percentage points over Qwen3-0.6B in non-thinking mode (64.93%).

Benchmark Details (LLM-AggreFact)

Balanced Accuracy Scores

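| Model | Size | Avg | CNN | XSum | MediaS | MeetB | WiCE | REVEAL | Claim Verify | Fact Check | Expert QA | LFQA | RAG Truth |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Qwen3-0.6B (non-thinking) | 0.6B | 64.93 | 57.93 | 66.71 | 57.13 | 62.60 | 67.23 | 86.99 | 59.85 | 72.63 | 56.44 | 70.18 | 56.56 |
| Veritas-0.6B-Fact-Checker-Non-Thinking-1.0 | 0.6B | 72.30 | 65.84 | 66.75 | 68.45 | 73.47 | 73.43 | 83.32 | 71.89 | 73.45 | 58.66 | 82.57 | 77.49 |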
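
The reported averages and the overall gain can be rechecked directly from the per-dataset scores in the table above. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of any released tooling.

```python
from statistics import mean

# Per-dataset balanced accuracy (%), copied from the table above in column order.
datasets = ["CNN", "XSum", "MediaS", "MeetB", "WiCE", "REVEAL",
            "Claim Verify", "Fact Check", "Expert QA", "LFQA", "RAG Truth"]
baseline = [57.93, 66.71, 57.13, 62.60, 67.23, 86.99, 59.85, 72.63, 56.44, 70.18, 56.56]
veritas = [65.84, 66.75, 68.45, 73.47, 73.43, 83.32, 71.89, 73.45, 58.66, 82.57, 77.49]

# Unweighted mean over the 11 datasets, rounded to two decimals as in the table.
baseline_avg = round(mean(baseline), 2)
veritas_avg = round(mean(veritas), 2)

# Overall gain in percentage points.
gain = round(veritas_avg - baseline_avg, 2)

# Per-dataset change, to see where the fine-tuned model helps or regresses.
deltas = {name: round(v - b, 2) for name, b, v in zip(datasets, baseline, veritas)}

print(baseline_avg, veritas_avg, gain)
for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")
```

The per-dataset differences show that the gain is not uniform: RAG Truth improves the most, while REVEAL is the one dataset where the base model scores higher.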

The benchmarks noted here for Veritas-0.6B-Fact-Checker-Non-Thinking-1.0 were run on the test set, and a Pull Request has been submitted to MiniCheck's library to support additional operating modes, including this model. Note: Performance may vary slightly depending on hardware configuration and vLLM version.


Model Usage

Scope of Use

  • The Veritas-0.6B-Fact-Checker-Non-Thinking model must be used only in the prescribed scoring mode, which generates a binary classification based on the specified template. Any deviation from this intended use may lead to unexpected outputs.

Using Minicheck's library 1

This requires the changes from our Pull Request to be merged first; see footnote 1.

Please run the following command to install the MiniCheck package and all necessary dependencies.

sh

pip install "minicheck[llm] @ git+https://github.com/Liyan06/MiniCheck.git@main"

Below is a simple use case

python

from minicheck.minicheck import MiniCheck
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
doc = "A group of students gather in the school library to study for their upcoming final exams."
claim_1 = "The students are preparing for an examination."
claim_2 = "The students are on vacation."
chat_kwargs = {'enable_thinking': False}
scorer = MiniCheck(
    model_name='resect-ai/veritas-0.6B-fact-checker-non-thinking-1.0',
    enable_prefix_caching=False,
    extra_chat_template_kwargs=chat_kwargs,
    operating_mode="bespoke",
    max_tokens=1,
    cache_dir='./ckpts',
    bypass_model_check=True,
)
# `chunk_size=your-specified-value` can be set in score(); it defaults to a 32K chunk size.
pred_label, raw_prob, _, _ = scorer.score(docs=[doc, doc], claims=[claim_1, claim_2])
print(pred_label) # [1, 0]
print(raw_prob) # [0.9796443054985795, 0.008577403129593576]
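
The two outputs above are related: `raw_prob` is a per-claim support probability and `pred_label` is the binary decision. A minimal sketch of that relationship follows, assuming the label is obtained by thresholding the probability at 0.5. Both the threshold value and the helper `to_labels` are assumptions made for illustration, not something this README or the library confirms.

```python
def to_labels(probs, threshold=0.5):
    """Map support probabilities to binary labels (1 = supported, 0 = unsupported).

    The 0.5 threshold is an assumption made for this sketch.
    """
    return [1 if p > threshold else 0 for p in probs]


# Probabilities copied from the example output above.
raw_prob = [0.9796443054985795, 0.008577403129593576]
labels = to_labels(raw_prob)
print(labels)
```

Under this assumption, a stricter threshold trades recall on supported claims for fewer false positives, which may matter when unsupported claims are costly to miss.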

Test on LLM-AggreFact Benchmark 1

python

import pandas as pd
from datasets import load_dataset
from minicheck.minicheck import MiniCheck
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
# load 30K test data
df = pd.DataFrame(load_dataset("lytang/LLM-AggreFact")['test'])
docs = df.doc.values
claims = df.claim.values
chat_kwargs = {'enable_thinking': False}
scorer = MiniCheck(
    model_name='resect-ai/veritas-0.6B-fact-checker-non-thinking-1.0',
    enable_prefix_caching=False,
    extra_chat_template_kwargs=chat_kwargs,
    operating_mode="bespoke",
    max_tokens=1,
    cache_dir='./ckpts',
    bypass_model_check=True,
)
pred_label, raw_prob, _, _ = scorer.score(docs=docs, claims=claims)

To evaluate the result on the benchmark

python

from sklearn.metrics import balanced_accuracy_score
df['preds'] = pred_label
result_df = pd.DataFrame(columns=['Dataset', 'BAcc'])
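# Compute balanced accuracy (in %) separately for each source dataset.
for dataset in df.dataset.unique():
    sub_df = df[df.dataset == dataset]
    bacc = balanced_accuracy_score(sub_df.label, sub_df.preds) * 100
    result_df.loc[len(result_df)] = [dataset, bacc]
# Append the unweighted mean over all datasets as the final row.
result_df.loc[len(result_df)] = ['Average', result_df.BAcc.mean()]
result_df.round(1)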
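
For readers unfamiliar with the metric, balanced accuracy is the mean of per-class recall, so each class counts equally even when the labels are imbalanced. The sketch below reimplements that definition in plain Python on made-up toy labels, for illustration only; the actual evaluation above uses scikit-learn's `balanced_accuracy_score`.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes that occur in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        # Indices of all examples whose true label is this class.
        idx = [i for i, y in enumerate(y_true) if y == cls]
        # Fraction of those examples that were predicted correctly.
        correct = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


# Toy labels (not benchmark data): four supported claims, two unsupported.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]

score = balanced_accuracy(y_true, y_pred)
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(score, plain_accuracy)
```

Here recall is 3/4 on the supported class and 1/2 on the unsupported class, so the balanced accuracy is 0.625, lower than the plain accuracy of 4/6 because the minority class is weighted equally.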

License

Acknowledgements

Model perfected by Resect Research Labs.

Footnotes

  1. A Pull Request to MiniCheck's library has been submitted and is awaiting review.

Model provider

resect-ai

Model tree

Base

Qwen/Qwen3-0.6B

Fine-tuned

this model

Modalities

Input

Text

Output

Text
