License: apache-2.0
Introducing: Veritas-0.6B-Fact-Checker-Non-Thinking-1.0
- Veritas-0.6B-Fact-Checker-Non-Thinking-1.0 is built on the Qwen3 architecture, starting from Qwen/Qwen3-0.6B. Resect Research Labs has specialized, fine-tuned, and optimized this model for fact-checking and factual consistency verification.
Model Performance
- The model is evaluated on LLM-AggreFact, which it did not see during training. The benchmark aggregates 11 human-annotated datasets on fact-checking and grounding.
Overall Performance
- Veritas-0.6B-Fact-Checker-Non-Thinking-1.0 achieves an average balanced accuracy of 72.30%, an improvement of 7.37 percentage points over Qwen3-0.6B in non-thinking mode (64.93%).
Benchmark Details (LLM-AggreFact)
Balanced Accuracy Scores
| Model | Size | Avg | CNN | XSum | MediaS | MeetB | WiCE | REVEAL | Claim Verify | Fact Check | Expert QA | LFQA | RAG Truth |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Qwen3-0.6B (non-thinking) | 0.6B | 64.93 | 57.93 | 66.71 | 57.13 | 62.60 | 67.23 | 86.99 | 59.85 | 72.63 | 56.44 | 70.18 | 56.56 |
| Veritas-0.6B-Fact-Checker-Non-Thinking-1.0 | 0.6B | 72.30 | 65.84 | 66.75 | 68.45 | 73.47 | 73.43 | 83.32 | 71.89 | 73.45 | 58.66 | 82.57 | 77.49 |
The benchmark results reported here for Veritas-0.6B-Fact-Checker-Non-Thinking-1.0 were computed on the test set. A pull request has been submitted to MiniCheck's library to support additional operating modes, including this model.
Note: Performance may vary slightly depending on hardware configuration and vLLM version.
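As a quick sanity check on the reported averages, the sketch below recomputes them from the per-dataset scores in the table above. The numbers are copied from the table; the variable names are illustrative only and are not part of any library.

```python
# Balanced accuracy scores copied from the table above (LLM-AggreFact test set).
DATASETS = ["CNN", "XSum", "MediaS", "MeetB", "WiCE", "REVEAL",
            "Claim Verify", "Fact Check", "Expert QA", "LFQA", "RAG Truth"]
BASELINE = [57.93, 66.71, 57.13, 62.60, 67.23, 86.99, 59.85, 72.63, 56.44, 70.18, 56.56]
VERITAS = [65.84, 66.75, 68.45, 73.47, 73.43, 83.32, 71.89, 73.45, 58.66, 82.57, 77.49]


def mean(values):
    """Unweighted mean across datasets, matching the 'Avg' column."""
    return sum(values) / len(values)


baseline_avg = mean(BASELINE)
veritas_avg = mean(VERITAS)

# Per-dataset change in percentage points (Veritas minus baseline).
deltas = {name: round(v - b, 2) for name, b, v in zip(DATASETS, BASELINE, VERITAS)}

print(f"Baseline average: {baseline_avg:.2f}")
print(f"Veritas average:  {veritas_avg:.2f}")
print(f"Gain (points):    {veritas_avg - baseline_avg:.2f}")
for name, delta in sorted(deltas.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>12}: {delta:+.2f}")
```

The per-dataset breakdown shows that the gain is uneven: the largest improvement is on RAG Truth, while REVEAL is the only dataset where the fine-tuned model scores below the base model.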
Model Usage
Scope of Use
- Veritas-0.6B-Fact-Checker-Non-Thinking-1.0 must be used only in the prescribed scoring mode, which generates a binary classification based on the specified template. Any deviation from this intended use may lead to unexpected outputs.
Using MiniCheck's library [1]
Requires the changes from our pull request to be merged; see footnote [1].
Please run the following command to install the MiniCheck package and all necessary dependencies.
```sh
pip install "minicheck[llm] @ git+https://github.com/Liyan06/MiniCheck.git@main"
```
Below is a simple use case:
```python
from minicheck.minicheck import MiniCheck
import os

os.environ["CUDA_VISIBLE_DEVICES"] = "0"

doc = "A group of students gather in the school library to study for their upcoming final exams."
claim_1 = "The students are preparing for an examination."
claim_2 = "The students are on vacation."

# Disable Qwen3 thinking mode in the chat template.
chat_kwargs = {'enable_thinking': False}

scorer = MiniCheck(
    model_name='resect-ai/veritas-0.6B-fact-checker-non-thinking-1.0',
    enable_prefix_caching=False,
    extra_chat_template_kwargs=chat_kwargs,
    operating_mode="bespoke",
    max_tokens=1,
    cache_dir='./ckpts',
    bypass_model_check=True,
)

# You can set `chunk_size=your-specified-value` here; it defaults to a 32K chunk size.
pred_label, raw_prob, _, _ = scorer.score(docs=[doc, doc], claims=[claim_1, claim_2])

print(pred_label)  # [1, 0]
print(raw_prob)    # [0.9796443054985795, 0.008577403129593576]
```
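To illustrate how the two outputs relate, here is a minimal sketch that maps support probabilities to binary labels. The 0.5 threshold and the helper name `to_binary_labels` are assumptions made for illustration; the library's own decision rule may differ, so treat `pred_label` from the scorer as authoritative.

```python
def to_binary_labels(probs, threshold=0.5):
    """Map support probabilities to labels: 1 = supported, 0 = unsupported.

    The threshold of 0.5 is an assumption for this sketch, not a documented
    property of the scoring library.
    """
    return [1 if p > threshold else 0 for p in probs]


# Probabilities taken from the example output above.
raw_prob = [0.9796443054985795, 0.008577403129593576]
labels = to_binary_labels(raw_prob)
print(labels)
```

Keeping the raw probabilities alongside the labels can be useful if a downstream application wants a stricter or looser cutoff than the default.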
Test on LLM-AggreFact Benchmark [1]
```python
import pandas as pd
from datasets import load_dataset
from minicheck.minicheck import MiniCheck
import os

os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# Load the 30K test examples.
df = pd.DataFrame(load_dataset("lytang/LLM-AggreFact")['test'])
docs = df.doc.values
claims = df.claim.values

# Disable Qwen3 thinking mode in the chat template.
chat_kwargs = {'enable_thinking': False}

scorer = MiniCheck(
    model_name='resect-ai/veritas-0.6B-fact-checker-non-thinking-1.0',
    enable_prefix_caching=False,
    extra_chat_template_kwargs=chat_kwargs,
    operating_mode="bespoke",
    max_tokens=1,
    cache_dir='./ckpts',
    bypass_model_check=True,
)

pred_label, raw_prob, _, _ = scorer.score(docs=docs, claims=claims)
```
To evaluate the result on the benchmark:
```python
from sklearn.metrics import balanced_accuracy_score

df['preds'] = pred_label
result_df = pd.DataFrame(columns=['Dataset', 'BAcc'])

# Compute balanced accuracy (in percent) for each constituent dataset.
for dataset in df.dataset.unique():
    sub_df = df[df.dataset == dataset]
    bacc = balanced_accuracy_score(sub_df.label, sub_df.preds) * 100
    result_df.loc[len(result_df)] = [dataset, bacc]

# Append the unweighted average across datasets.
result_df.loc[len(result_df)] = ['Average', result_df.BAcc.mean()]
result_df.round(1)
```
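For readers unfamiliar with the metric, balanced accuracy is the mean of per-class recall, which keeps a majority class from dominating the score. The sketch below reimplements it in plain Python on a small toy example; the data is hypothetical and not drawn from the benchmark, and the function is an illustration rather than a replacement for `balanced_accuracy_score`.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == cls)
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        recalls.append(correct / total)
    return sum(recalls) / len(recalls)


# Toy data (hypothetical): 4 supported claims (label 1), 2 unsupported (label 0).
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = balanced_accuracy(y_true, y_pred)

print(f"Plain accuracy:    {plain_accuracy * 100:.1f}")
print(f"Balanced accuracy: {score * 100:.1f}")
```

In this toy case recall is 3/4 on the supported class and 1/2 on the unsupported class, so balanced accuracy is lower than plain accuracy because errors on the smaller class weigh as much as errors on the larger one.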
License
- This model, Veritas-0.6B-Fact-Checker-Non-Thinking-1.0, is bound by the Apache 2.0 license found at https://choosealicense.com/licenses/apache-2.0. By downloading and using this model, you agree to the license terms.
Acknowledgements
Model perfected by Resect Research Labs.
Footnotes
[1] Pull request to MiniCheck's library submitted, awaiting review.
Model provider: resect-ai
Model tree: fine-tuned from base model Qwen/Qwen3-0.6B
Modalities: text input, text output