zenlm

zen-reranker


Overview

Zen Reranker is optimized for:

Retrieval-Augmented Generation (RAG) — re-score retrieved passages for LLM context
Search quality improvement — rerank initial BM25/dense retrieval results
Cross-lingual retrieval — strong multilingual performance
DSO integration — compatible with Hanzo's Decentralized Semantic Optimization
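The retrieve-then-rerank pattern behind the first two items can be sketched as follows. This is a minimal illustration, not the project's pipeline: `lexical_overlap`, `retrieve_then_rerank`, and `toy_scorer` are hypothetical names, and the word-overlap scorer merely stands in for BM25 or dense retrieval so the sketch runs without downloading a model.

```python
from typing import Callable, List, Tuple


def lexical_overlap(query: str, passage: str) -> float:
    """Cheap first-stage score: fraction of query terms found in the passage."""
    q_terms = set(query.lower().split())
    p_terms = set(passage.lower().split())
    return len(q_terms & p_terms) / len(q_terms) if q_terms else 0.0


def retrieve_then_rerank(
    query: str,
    corpus: List[str],
    score_pairs: Callable[[List[Tuple[str, str]]], List[float]],
    first_stage_k: int = 20,
    final_k: int = 3,
) -> List[Tuple[str, float]]:
    # Stage 1: keep the first_stage_k candidates with the best cheap score.
    candidates = sorted(
        corpus, key=lambda p: lexical_overlap(query, p), reverse=True
    )[:first_stage_k]
    # Stage 2: score each (query, passage) pair with the more expensive reranker.
    scores = score_pairs([(query, p) for p in candidates])
    ranked = sorted(zip(candidates, scores), key=lambda x: x[1], reverse=True)
    return ranked[:final_k]


def toy_scorer(pairs: List[Tuple[str, str]]) -> List[float]:
    # Hypothetical stand-in for the reranker so the sketch runs offline.
    return [lexical_overlap(q, p) for q, p in pairs]


corpus = [
    "Paris is the capital of France.",
    "Berlin is in Germany.",
    "Madrid is in Spain.",
]
top = retrieve_then_rerank("capital of France", corpus, toy_scorer, final_k=1)
print(top)
```

In a real pipeline, `score_pairs` would wrap the model call, and `first_stage_k` trades recall against the number of reranker forward passes.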

Quick Start

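python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "zenlm/zen-reranker"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, torch_dtype=torch.float16)
model.eval()  # inference only: disable dropout

# Batched sequence classification needs a pad token id on the model config.
if model.config.pad_token_id is None:
    model.config.pad_token_id = tokenizer.pad_token_id

def rerank(query, passages):
    """Return (passage, score) tuples sorted from most to least relevant."""
    # Each passage is scored jointly with the query as a (query, passage) pair.
    pairs = [[query, p] for p in passages]
    inputs = tokenizer(
        pairs, padding=True, truncation=True,
        max_length=512, return_tensors="pt"
    )
    with torch.no_grad():
        # squeeze(-1) assumes one logit per pair; cast to float32 before tolist().
        scores = model(**inputs).logits.squeeze(-1).float()
    ranked = sorted(zip(passages, scores.tolist()), key=lambda x: x[1], reverse=True)
    return ranked

query = "What is the capital of France?"
passages = ["Paris is the capital of France.", "Berlin is in Germany.", "Madrid is in Spain."]
results = rerank(query, passages)
for passage, score in results:
    print(f"{score:.3f}: {passage}")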
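The scores returned by `rerank` are raw logits, so they are only meaningful relative to each other. If a bounded score or a cut-off is needed, one common option is to pass them through a sigmoid. The sketch below assumes a single logit per pair. Whether the result is a calibrated probability depends on how the model was trained, which this README does not state. `filter_by_relevance` and the 0.5 threshold are illustrative choices, and the example logits are made up.

```python
import math
from typing import List, Tuple


def sigmoid(x: float) -> float:
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def filter_by_relevance(
    ranked: List[Tuple[str, float]], threshold: float = 0.5
) -> List[Tuple[str, float]]:
    """Map raw logits to (0, 1) and drop passages below the threshold."""
    normalized = [(passage, sigmoid(score)) for passage, score in ranked]
    return [(p, s) for p, s in normalized if s >= threshold]


# Made-up logits for illustration; real values come from rerank().
example = [("Paris is the capital of France.", 4.2), ("Berlin is in Germany.", -3.1)]
kept = filter_by_relevance(example)
for passage, score in kept:
    print(f"{score:.3f}: {passage}")
```

Because the sigmoid is monotonic, the ordering is unchanged; only the scale is.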

With sentence-transformers

python
from sentence_transformers import CrossEncoder

model = CrossEncoder("zenlm/zen-reranker")
# predict() takes (query, passage) pairs and returns one score per pair, in input order.
scores = model.predict([
    ["What is the capital of France?", "Paris is the capital of France."],
    ["What is the capital of France?", "Berlin is in Germany."],
])
print(scores)
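`predict` returns scores in input order, so they still need to be matched back to their passages. A small helper for that might look like the following. `top_k_passages` is a hypothetical name, and the scores are placeholders rather than real model output.

```python
from typing import List, Sequence, Tuple


def top_k_passages(
    passages: Sequence[str], scores: Sequence[float], k: int = 3
) -> List[Tuple[str, float]]:
    """Pair each passage with its score and return the k best, highest first."""
    if len(passages) != len(scores):
        raise ValueError("passages and scores must have the same length")
    paired = zip(passages, (float(s) for s in scores))
    return sorted(paired, key=lambda x: x[1], reverse=True)[:k]


# Placeholder scores; in practice pass the array returned by model.predict().
passages = ["Paris is the capital of France.", "Berlin is in Germany."]
best = top_k_passages(passages, [0.91, 0.07], k=1)
print(best)
```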

Specifications

Attribute	Value
Parameters	4B
Architecture	Qwen3ForSequenceClassification
Context	32,768 tokens
Languages	100+ (multilingual)
License	Apache 2.0
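As a rough sizing aid, the parameter count translates into weight memory as parameters times bytes per parameter. The sketch below treats "4B" as exactly 4e9 parameters, which is an approximation, and counts weights only: activations and framework overhead come on top. `weight_memory_gib` is a hypothetical helper.

```python
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weight_memory_gib(num_params: float, dtype: str = "float16") -> float:
    """Memory needed for the weights alone, in GiB (2**30 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


for dtype in ("float32", "float16", "int8"):
    print(f"{dtype}: {weight_memory_gib(4e9, dtype):.2f} GiB")
```

Under these assumptions, the float16 load used in the Quick Start needs roughly 7.5 GiB for the weights.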

Use Cases

RAG pipelines — rerank retrieved chunks before passing to LLM
Search engines — improve document ranking quality
QA systems — score answer candidates for relevance
Semantic deduplication — score similarity for clustering
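For the deduplication use case, one simple approach is a greedy pass that keeps a text only if it scores below a threshold against everything already kept. This sketch rests on several assumptions: `deduplicate` and `jaccard` are hypothetical names, the Jaccard function stands in for a model score so the example runs offline, and the 0.8 threshold is arbitrary. A cross-encoder score is query-to-passage relevance and need not be symmetric, so a real setup may need to score both directions.

```python
from typing import Callable, List


def deduplicate(
    texts: List[str],
    similarity: Callable[[str, str], float],
    threshold: float = 0.8,
) -> List[str]:
    """Greedy dedup: keep a text only if it is not too similar to any kept text."""
    kept: List[str] = []
    for text in texts:
        if all(similarity(text, other) < threshold for other in kept):
            kept.append(text)
    return kept


def jaccard(a: str, b: str) -> float:
    # Hypothetical stand-in for a reranker score, so the sketch runs offline.
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0


docs = [
    "Paris is the capital of France.",
    "paris is the capital of france.",
    "Berlin is in Germany.",
]
unique = deduplicate(docs, jaccard)
print(unique)
```

The pass makes up to n(n-1)/2 pairwise comparisons, so with a 4B-parameter scorer it is practical only for small candidate sets or after a cheaper pre-filter.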

Abliteration

Like all Zen models, Zen Reranker is abliterated — refusal bias has been removed using directional ablation via hanzoai/remove-refusals.

Technique: "Refusal in Language Models Is Mediated by a Single Direction" (Arditi et al., 2024).
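The core operation in directional ablation is a projection: given a direction r, each hidden vector h is replaced by h minus its component along r. The sketch below shows only that arithmetic on plain Python lists. It does not reproduce the hanzoai/remove-refusals code; `ablate_direction` is a hypothetical name, and the toy vectors are made up. In practice the direction is estimated from model activations.

```python
import math
from typing import List


def ablate_direction(hidden: List[float], direction: List[float]) -> List[float]:
    """Remove the component of `hidden` that lies along `direction`."""
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    unit = [d / norm for d in direction]
    # Projection coefficient: how much of `hidden` points along the unit direction.
    coeff = sum(h * u for h, u in zip(hidden, unit))
    return [h - coeff * u for h, u in zip(hidden, unit)]


h = [3.0, 4.0, 0.0]
r = [0.0, 2.0, 0.0]  # toy direction; a real one is estimated from activations
h_ablated = ablate_direction(h, r)
print(h_ablated)
```

After ablation the vector is orthogonal to r, so nothing downstream can read a signal along that direction.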

Model Family

Model	Parameters	Use Case
Zen Nano	0.6B	Edge AI
Zen Scribe	4B	Writing
Zen Pro	8B	Professional AI
Zen Reranker	4B	Reranking

Citation

bibtex
@misc{zen-reranker-2025,
  title={Zen Reranker: High-Performance Neural Reranking},
  author={Hanzo AI and Zoo Labs Foundation},
  year={2025},
  url={https://huggingface.co/zenlm/zen-reranker}
}

Part of the Zen model ecosystem by Hanzo AI (Techstars '17) and Zoo Labs Foundation.


