ZygAI/ZygAI-OSS-2B-Encyclopedia API & Inference Endpoint

✨ Highlights

Bilingual — understands and continues text in both Lithuanian and English
Encyclopedic style — dry, academic tone with dates, geographic terms, and structured prose
Greedy Search optimised — designed for do_sample=False to maximise factual accuracy
Lightweight — ~1.7B parameters, runs on consumer hardware (CPU or single GPU)
Trained on A100 SXM via RunPod

🚀 Quick Start

Text continuation (recommended usage)

python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

BASE_MODEL = "HuggingFaceTB/SmolLM2-1.7B"
PEFT_MODEL = "ZygAI/ZygAI-OSS-2B-Encyclopedia"

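# Use the GPU with float16 when one is available; otherwise fall back to CPU with float32
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype=dtype)
# Attach the LoRA adapter weights to the base model
model = PeftModel.from_pretrained(model, PEFT_MODEL)
model.to(device)
model.eval()

prompt = "Vilnius is the capital of Lithuania, which"

# Keep the input tensors on the same device as the model
inputs = tokenizer(prompt, return_tensors="pt").to(device)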
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=150,
        do_sample=False,
        repetition_penalty=1.3,
        no_repeat_ngram_size=3,
        pad_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
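
The generate call above combines greedy search (do_sample=False) with repetition_penalty=1.3. As a rough illustration of how those two settings interact, the sketch below applies the same kind of penalty to a toy list of scores in plain Python. The function names and the four-token vocabulary are made up for this example; the real logic lives inside the Transformers generation code and operates on tensors.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.3):
    """Return a copy of `logits` with already-generated tokens made less likely."""
    adjusted = list(logits)
    for token_id in set(generated_ids):
        score = adjusted[token_id]
        # Positive scores are divided by the penalty, negative scores multiplied,
        # so in both cases the token becomes less attractive.
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted


def greedy_pick(logits):
    """Greedy search step: choose the index with the highest score."""
    return max(range(len(logits)), key=logits.__getitem__)


# Toy vocabulary of 4 tokens (illustrative values); token 2 was already generated.
logits = [1.0, 2.0, 2.5, -1.0]
print(greedy_pick(logits))                                 # prints 2
print(greedy_pick(apply_repetition_penalty(logits, [2])))  # prints 1
```

Without the penalty, greedy search would pick token 2 again; with it, the score of token 2 drops from 2.5 to about 1.92 and token 1 wins instead.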

Try the live demo

👉 ZygAI-Encyclopedia-2B-DEMO

Run locally (Gradio Space)

Clone and run the full demo interface on your own machine:

bash
# Clone the Space repository
git clone https://huggingface.co/spaces/ZygAI/ZygAI-Encyclopedia-2B-DEMO
cd ZygAI-Encyclopedia-2B-DEMO

# Create and activate a Python virtual environment
python -m venv env
source env/bin/activate  # Windows: env\Scripts\activate

# Install dependencies and launch
pip install -r requirements.txt
python app.py

Run with Docker (GPU)

bash
docker run -it -p 7860:7860 --platform=linux/amd64 --gpus all \
    registry.hf.space/zygai-zygai-encyclopedia-2b-demo:latest python app.py

Note: Review the Space code before running locally. GPU (--gpus all) is optional — the app falls back to CPU automatically.
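
After starting the app with either method, a quick way to confirm it is serving is to request the page from Python. This is a minimal sketch using only the standard library. The URL assumes the port mapping from the Docker command above (7860); the helper name is_demo_up is made up for this example and is not part of the Space.

```python
import urllib.error
import urllib.request


def is_demo_up(url="http://localhost:7860", timeout=5.0):
    """Return True if an HTTP server answers at `url` with a 2xx or 3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status
        return False


if __name__ == "__main__":
    print("Demo reachable:", is_demo_up())
```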

📖 Intended Use

This model is designed for encyclopedic text continuation — given a factual opening phrase, it continues in the same structured, neutral style. It works best with prompts that are:

Factual and concise
Written in either Lithuanian or English
Styled like a Wikipedia sentence opening

Example prompts:

Albert Einstein was a physicist who
Vilnius yra Lietuvos sostinė, kuri
The Battle of Grunwald took place in
Kauno pilis yra vienas seniausių

⚙️ Training Details

| Property | Value |
|---|---|
| Base model | HuggingFaceTB/SmolLM2-1.7B |
| Fine-tuning method | LoRA (PEFT) via TRL SFT |
| Training data | 200,000+ Wikipedia paragraphs (LT + EN) |
| Hardware | NVIDIA A100 SXM (RunPod) |
| Precision | float16 |
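
For readers unfamiliar with LoRA, the general form of the update is sketched below. It describes the standard technique, not this model's specific configuration: the rank r, the scaling factor alpha, and which weight matrices were adapted are not stated in this card.

```latex
% Standard LoRA reparameterisation of a frozen weight matrix W_0 (d x k)
W' = W_0 + \frac{\alpha}{r}\, B A,
\qquad B \in \mathbb{R}^{d \times r},\quad
A \in \mathbb{R}^{r \times k},\quad
r \ll \min(d, k)

% Trainable parameters per adapted matrix, versus full fine-tuning
r\,(d + k) \quad \text{instead of} \quad d\,k
```

Only A and B are trained while W_0 stays frozen, which is why the quick-start code loads the base model first and then attaches the adapter with PeftModel.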

Framework Versions

TRL: 1.5.1
Transformers: 5.10.2
PyTorch: 2.4.1+cu124
Datasets: 5.0.0
Tokenizers: 0.22.2
PEFT: latest

⚠️ Limitations

This is a research / experimental model, not a production assistant
At ~1.7B parameters, factual coverage is limited — hallucinations can occur
Best results with do_sample=False (greedy); sampling may produce incoherent output
Not suitable for instruction following or dialogue tasks
Lithuanian coverage is improving but still narrower than English

📜 Citation

If you use this model, please cite the base model (HuggingFaceTB/SmolLM2-1.7B) and TRL. The TRL entry is:

bibtex
@software{vonwerra2020trl,
  title   = {{TRL: Transformers Reinforcement Learning}},
  author  = {von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward
             and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif
             and Gallouédec, Quentin},
  license = {Apache-2.0},
  url     = {https://github.com/huggingface/trl},
  year    = {2020}
}

🔗 Related

ZygAI Platform — self-hosted AI chat platform by ZygMediaGroup
ZygAI on HuggingFace — all open source models
ZygMediaGroup
