What is Exegetical Generation?
Unlike translation (1:1 semantic mapping), exegetical commentary expands a source text 5–20× by recovering:
- Implicit definitions of technical terms
- Philosophical context and doctrinal significance
- Cross-references to other texts in the tradition
- The commentator's interpretive framework
This model learns to produce such commentary in the style of Mark Dyczkowski's Kashmir Śaivism scholarship.
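To make the expansion figure concrete, here is a minimal sketch of how an expansion ratio could be computed. It assumes whitespace-separated word counts as the unit of length; this card does not state whether the reported ratios use characters, words, or tokens. `expansion_ratio` is a hypothetical helper name.

```python
def expansion_ratio(source: str, commentary: str) -> float:
    """Length of the commentary relative to the source, in whitespace-separated words.

    Assumption: words are the unit of length; the ratios reported in this
    card may have been measured in tokens or characters instead.
    """
    source_words = len(source.split())
    if source_words == 0:
        raise ValueError("source text must contain at least one word")
    return len(commentary.split()) / source_words


source = "śaktipāta"
commentary = "Śaktipāta is the descent of grace transmitted by the guru"
print(expansion_ratio(source, commentary))  # 10.0
```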
Experiment Results
| System | Info Gain (G) | Expansion Ratio | Notes |
|---|---|---|---|
| B2: Claude Haiku zero-shot | 77 | 128.6× | Fluent but ungrounded |
| B3.1: Claude RAG hybrid | 94 | 132.6× | MW dictionary grounding, best overall |
| B4-fs: Nova Micro few-shot | 25 | 81.8× | Small model baseline |
| B4-ft: This model | 17 | 56.0× | Style transfer works, metric undercounts Devanāgarī |
The low G score for this model is partly an artifact: the information gain metric undercounts inline Devanāgarī terms (e.g., "Kubjikā (कुब्जिका)"), a distinctive feature of Dyczkowski's style that the model reproduces.
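The undercounting described above can be illustrated with a small sketch. The information gain metric itself is not specified in this card, so the code does not reproduce it. It only contrasts a hypothetical Latin-only word count with a count of inline Devanāgarī annotations that such a counter would miss. The function names and regular expressions are assumptions made for illustration.

```python
import re
import unicodedata

# Parenthesised run of Devanāgarī characters (Unicode block U+0900 to U+097F),
# for example "(कुब्जिका)".
DEVANAGARI_ANNOTATION = re.compile(r"\(([\u0900-\u097F]+(?:\s+[\u0900-\u097F]+)*)\)")
# Words written in Latin letters, including precomposed IAST diacritics.
LATIN_WORD = re.compile(r"[A-Za-z\u00C0-\u024F\u1E00-\u1EFF]+")


def inline_devanagari_terms(text: str) -> list[str]:
    """Return the Devanāgarī strings that appear as parenthesised annotations."""
    return DEVANAGARI_ANNOTATION.findall(unicodedata.normalize("NFC", text))


def latin_only_word_count(text: str) -> int:
    """Count words as a hypothetical Latin-only metric would see them."""
    return len(LATIN_WORD.findall(unicodedata.normalize("NFC", text)))


sample = "Kuṇḍalinī (कुण्डलिनी) is identified with Kubjikā (कुब्जिका)."
print(latin_only_word_count(sample))         # 5
print(len(inline_devanagari_terms(sample)))  # 2 terms invisible to the Latin-only count
```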
Training Details
| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen2.5-14B-Instruct |
| Method | QLoRA (4-bit NF4) |
| Rank / Alpha | 16 / 32 |
| Trainable params | 40M / 7.6B total (0.53%) |
| Training data | 704 pairs (28 OCR + 676 lecture) |
| Validation | 88 pairs |
| Test | 89 pairs |
| Epochs | 3 (264 steps) |
| Training time |
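
The step count, trainable-parameter fraction, and split proportions can be cross-checked with a few lines of arithmetic, using the figures exactly as reported in the table. The sketch assumes an effective batch size of 8 (per-device batch size times gradient accumulation steps). That value is inferred from the reported 264 steps and is not stated in this card.

```python
import math

train_pairs = 704
epochs = 3
effective_batch_size = 8  # assumption: inferred from the step count, not stated in the card

steps_per_epoch = math.ceil(train_pairs / effective_batch_size)
total_steps = steps_per_epoch * epochs
print(total_steps)  # 264

# Trainable-parameter fraction, using the table's reported parameter counts.
trainable_params, total_params = 40e6, 7.6e9
trainable_pct = 100 * trainable_params / total_params
print(f"{trainable_pct:.2f}%")  # 0.53%

# Share of each split in the full set of pairs.
splits = {"train": 704, "validation": 88, "test": 89}
total_pairs = sum(splits.values())
shares = {name: f"{count / total_pairs:.1%}" for name, count in splits.items()}
print(total_pairs, shares)
```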
Training Data Sources
- OCR-extracted pairs (28) — Verse-commentary alignments from Tantrāloka Volume 1, pages 15-94. Extracted via Chandra OCR-2 (5.3B) on SageMaker.
- Lecture term-explanation pairs (676) — Sanskrit terms with contextual explanations from 24 Kubjikā/Paścimāmnāya lectures. Transcribed via Whisper large-v3, cleaned, and corrected with Claude Haiku 4.5.
Training data derived from Anamavajra-Labs/tantraloka-dyczkowski-raw.
Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load the base model in 4-bit NF4, matching the QLoRA training setup.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-14B-Instruct",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)

# Attach the LoRA adapter and load the tokenizer from the adapter repository.
model = PeftModel.from_pretrained(base, "Anamavajra-Labs/exegen-qwen14b-lora")
tokenizer = AutoTokenizer.from_pretrained("Anamavajra-Labs/exegen-qwen14b-lora")

# ChatML prompt: system instruction, user request, and an open assistant turn.
prompt = """<|im_start|>system
You are a scholar of Kashmir Śaivism and Sanskrit philosophy specializing in the Tantrāloka of Abhinavagupta. Generate detailed exegetical commentary that explains the philosophical significance, technical terminology, and doctrinal context.<|im_end|>
<|im_start|>user
Generate exegetical commentary for: śaktipāta (शक्तिपात)<|im_end|>
<|im_start|>assistant
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    do_sample=True,
    repetition_penalty=1.15,
)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
Hardware requirements: ~12 GB VRAM (4-bit quantized). Runs on a single RTX 3090/4090 or T4.
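For querying several terms, the ChatML prompt from the usage example can be assembled by a small helper. This is a sketch: `build_prompt` and `SYSTEM_PROMPT` are hypothetical names, and the template simply mirrors the hand-written prompt string shown in the usage example.

```python
SYSTEM_PROMPT = (
    "You are a scholar of Kashmir Śaivism and Sanskrit philosophy specializing in "
    "the Tantrāloka of Abhinavagupta. Generate detailed exegetical commentary that "
    "explains the philosophical significance, technical terminology, and doctrinal context."
)


def build_prompt(term: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Build a ChatML prompt that ends with an open assistant turn."""
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\nGenerate exegetical commentary for: {term}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


print(build_prompt("kuṇḍalinī (कुण्डलिनी)"))
```

The returned string can be passed to the tokenizer in place of `prompt`. If the tokenizer ships a chat template, `tokenizer.apply_chat_template(..., add_generation_prompt=True)` is an alternative way to build the same kind of prompt.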
Example Output
Input: kuṇḍalinī (कुण्डलिनी)
Output (excerpt):
Kuṇḍalinī (कुण्डलिनी) is the coiled energy that resides at the base of the central channel (suṣumnā). In the Kubjikā (कुब्जिका) tradition of the Paścimāmnāya (पश्चिमाम्नाय), she is identified with the goddess herself — the crooked one, bent with the weight of manifestation. Abhinavagupta in the Tantrāloka (तन्त्रालोक) describes her awakening through śaktipāta (शक्तिपात), the descent of grace, which occurs through the guru's transmission...
Note the characteristic Dyczkowski-style inline Devanāgarī annotations and cross-references to Kubjikā and Paścimāmnāya traditions.
Qualitative Observations
The model successfully learns:
- Inline Devanāgarī — "mantra (मन्त्र)", "Kubjikā (कुब्जिका)" style annotations
- Tradition-specific framing — References to Kubjikā, Kula, Paścimāmnāya, Krama
- Commentarial voice — Adopts Dyczkowski's oral teaching register
- Cross-referencing — Spontaneous references to related texts and practices
Known Limitations
- Small training set (704 pairs): some inputs trigger repetition loops (use repetition_penalty=1.15)
- Narrow domain: primarily Kashmir Śaivism / Kubjikā tradition; limited coverage of other darśanas
- Hallucinated terms: occasionally generates plausible but incorrect Sanskrit compounds
- No retrieval: pure generation without grounding; B3.1 (RAG) scores higher on factual accuracy
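
Repetition loops can be flagged after generation with a simple n-gram check. The sketch below is one possible heuristic and is not part of the released model or its evaluation. The function names, the n-gram size, and the threshold are all assumptions chosen for illustration.

```python
from collections import Counter


def max_ngram_repeats(text: str, n: int = 4) -> int:
    """Return the highest count of any word n-gram (0 if the text has fewer than n words)."""
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0
    return max(Counter(ngrams).values())


def looks_like_loop(text: str, n: int = 4, threshold: int = 3) -> bool:
    """Flag text in which some n-gram occurs at least `threshold` times."""
    return max_ngram_repeats(text, n) >= threshold


looping = "the descent of grace " * 5
normal = "Kuṇḍalinī is the coiled energy at the base of the central channel"
print(looks_like_loop(looping))  # True
print(looks_like_loop(normal))   # False
```

A flagged output could be regenerated with a higher `repetition_penalty` or a different sampling seed.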
Next Steps
119 additional Tantrāloka lectures (chapters 1-3) have been transcribed and are available in the dataset. Extracting verse-commentary pairs from them could expand the training set from 704 to 2000+ pairs for the next fine-tuning iteration.
Citation
@misc{exegen-qwen14b-lora,
title={ExeGen: Exegetical Generation as a New NLP Task},
author={Ovcharov, Vladimir and Tatarchenko, Igor},
year={2026},
publisher={Anamavajra Labs},
url={https://huggingface.co/Anamavajra-Labs/exegen-qwen14b-lora}
}
Organization
Anamavajra Labs — Sanskrit NLP & Contemplative Studies