License: apache-2.0
✨ Highlights
- Bilingual: understands and continues text in both Lithuanian and English
- Encyclopedic style: dry, academic tone with dates, geographic terms, and structured prose
- Greedy Search optimised: designed for `do_sample=False` to maximise factual accuracy
- Lightweight: ~1.7B parameters, runs on consumer hardware (CPU or single GPU)
- Trained on A100 SXM via RunPod
🚀 Quick Start
Text continuation (recommended usage)
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

BASE_MODEL = "HuggingFaceTB/SmolLM2-1.7B"
PEFT_MODEL = "ZygAI/ZygAI-OSS-2B-Encyclopedia"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(model, PEFT_MODEL)
model.eval()

prompt = "Vilnius is the capital of Lithuania, which"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=150,
        do_sample=False,
        repetition_penalty=1.3,
        no_repeat_ngram_size=3,
        pad_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
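The `no_repeat_ngram_size=3` setting in the call above stops the model from emitting any three-token sequence twice. The following is a minimal pure-Python sketch of that idea, not the Transformers implementation: the function name `banned_next_tokens` is hypothetical, and it works on plain word lists rather than token IDs.

```python
def banned_next_tokens(history, ngram_size=3):
    """Return the tokens that would complete an n-gram already seen in history.

    Illustrative sketch only; the name and the word-level tokens are assumptions.
    """
    if ngram_size < 2:
        raise ValueError("ngram_size must be at least 2 for this sketch")
    if len(history) < ngram_size:
        return set()  # no complete n-gram exists yet, so nothing can repeat

    # The last (n - 1) tokens form the prefix of whatever n-gram comes next.
    prefix = tuple(history[-(ngram_size - 1):])
    banned = set()
    for i in range(len(history) - ngram_size + 1):
        ngram = tuple(history[i:i + ngram_size])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])  # emitting this token would repeat the n-gram
    return banned


history = ["the", "castle", "is", "the", "castle"]
print(banned_next_tokens(history))
```

Here the text ends in "the castle", and "the castle is" has already appeared, so "is" is the one token that may not come next.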
Try the live demo
Run locally (Gradio Space)
Clone and run the full demo interface on your own machine:
```bash
# Clone the Space repository
git clone https://huggingface.co/spaces/ZygAI/ZygAI-Encyclopedia-2B-DEMO
cd ZygAI-Encyclopedia-2B-DEMO

# Create and activate a Python virtual environment
python -m venv env
source env/bin/activate  # Windows: env\Scripts\activate

# Install dependencies and launch
pip install -r requirements.txt
python app.py
```
Run with Docker (GPU)
```bash
docker run -it -p 7860:7860 --platform=linux/amd64 --gpus all \
    registry.hf.space/zygai-zygai-encyclopedia-2b-demo:latest python app.py
```
Note: Review the Space code before running locally. The `--gpus all` flag is optional; the app falls back to CPU automatically.
📖 Intended Use
This model is designed for encyclopedic text continuation — given a factual opening phrase, it continues in the same structured, neutral style. It works best with prompts that are:
- Factual and concise
- Written in either Lithuanian or English
- Styled like a Wikipedia sentence opening
Example prompts:
- `Albert Einstein was a physicist who`
- `Vilnius yra Lietuvos sostinė, kuri`
- `The Battle of Grunwald took place in`
- `Kauno pilis yra vienas seniausių`
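The prompt guidelines above can be turned into a rough pre-flight check. This is only a heuristic sketch: the function name `looks_like_encyclopedic_opening`, the 12-word limit, and the individual rules are assumptions made for illustration, not part of the model or its tooling.

```python
def looks_like_encyclopedic_opening(prompt, max_words=12):
    """Heuristically check that a prompt is a short, unfinished sentence opening.

    The rules and the max_words default are illustrative assumptions.
    """
    text = prompt.strip()
    words = text.split()
    if not words or len(words) > max_words:
        return False  # empty, or too long to count as concise
    if text[-1] in ".?!":
        return False  # a finished sentence or question leaves nothing to continue
    if not text[0].isupper():
        return False  # encyclopedic sentences open with a capitalised subject
    return True


prompts = [
    "Albert Einstein was a physicist who",
    "Vilnius yra Lietuvos sostinė, kuri",
    "What is the capital of Lithuania?",
]
for p in prompts:
    print(p, "->", looks_like_encyclopedic_opening(p))
```

The question is rejected because the model continues text rather than answering; rephrasing it as an opening such as "The capital of Lithuania is" fits the intended use.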
⚙️ Training Details
| Property | Value |
|---|---|
| Base model | HuggingFaceTB/SmolLM2-1.7B |
| Fine-tuning method | LoRA (PEFT) via TRL SFT |
| Training data | 200,000+ Wikipedia paragraphs (LT + EN) |
| Hardware | NVIDIA A100 SXM (RunPod) |
| Precision | float16 |
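As a back-of-the-envelope check on the hardware claim, the memory needed just to hold the weights can be estimated from the parameter count and the precision. The sketch below assumes the ~1.7B figure from this card and counts weights only; activations, the KV cache, and the LoRA adapter weights are not included, so real usage will be higher. The helper name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(n_params, bytes_per_param):
    """Approximate memory for model weights alone, in GiB."""
    return n_params * bytes_per_param / 1024**3


N_PARAMS = 1.7e9  # approximate parameter count stated in this card

fp16_gib = weight_memory_gib(N_PARAMS, 2)  # float16 stores 2 bytes per parameter
fp32_gib = weight_memory_gib(N_PARAMS, 4)  # float32 stores 4 bytes per parameter

print(f"float16 weights: {fp16_gib:.2f} GiB")
print(f"float32 weights: {fp32_gib:.2f} GiB")
```

Loading in float16, as the Quick Start does, roughly halves the weight footprint compared with float32.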
Framework Versions
- TRL: 1.5.1
- Transformers: 5.10.2
- PyTorch: 2.4.1+cu124
- Datasets: 5.0.0
- Tokenizers: 0.22.2
- PEFT: latest
⚠️ Limitations
- This is a research / experimental model, not a production assistant
- At ~1.7B parameters, factual coverage is limited — hallucinations can occur
- Best results with `do_sample=False` (greedy); sampling may produce incoherent output
- Not suitable for instruction following or dialogue tasks
- Lithuanian coverage is improving but still narrower than English
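The difference between greedy decoding and sampling can be illustrated with a toy next-token distribution. The probabilities below are made up for illustration and do not come from this model; the helper names `greedy_pick` and `sample_pick` are likewise hypothetical.

```python
import random

# Hypothetical next-token probabilities after "Vilnius is the capital of".
# These numbers are invented for illustration, not measured from the model.
next_token_probs = {"Lithuania": 0.90, "Poland": 0.06, "Latvia": 0.04}


def greedy_pick(probs):
    """Greedy decoding: always take the single most probable token."""
    return max(probs, key=probs.get)


def sample_pick(probs, rng):
    """Sampling: draw a token at random in proportion to its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


rng = random.Random(0)
greedy_runs = {greedy_pick(next_token_probs) for _ in range(100)}
sampled_runs = {sample_pick(next_token_probs, rng) for _ in range(100)}

print("greedy outcomes: ", greedy_runs)
print("sampled outcomes:", sampled_runs)
```

Greedy decoding returns the same token on every run, whereas sampling can occasionally pick a low-probability token, which is one way a factual continuation goes wrong.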
📜 Citation
If you use this model, please cite the base model and TRL:
```bibtex
@software{vonwerra2020trl,
  title   = {{TRL: Transformers Reinforcement Learning}},
  author  = {von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif and Gallouédec, Quentin},
  license = {Apache-2.0},
  url     = {https://github.com/huggingface/trl},
  year    = {2020}
}
```
🔗 Related
- ZygAI Platform — self-hosted AI chat platform by ZygMediaGroup
- ZygAI on HuggingFace — all open source models
- ZygMediaGroup