Anamavajra-Labs

exegen-qwen14b-lora

License: apache-2.0

What is Exegetical Generation?

Unlike translation (1:1 semantic mapping), exegetical commentary expands a source text 5–20× by recovering:

Implicit definitions of technical terms
Philosophical context and doctrinal significance
Cross-references to other texts in the tradition
The commentator's interpretive framework

This model learns to produce such commentary in the style of Mark Dyczkowski's Kashmir Śaivism scholarship.

Experiment Results

System	Info Gain (G)	Expansion Ratio	Notes
B2: Claude Haiku zero-shot	77	128.6×	Fluent but ungrounded
B3.1: Claude RAG hybrid	94	132.6×	MW dictionary grounding, best overall
B4-fs: Nova Micro few-shot	25	81.8×	Small model baseline
B4-ft: This model	17	56.0×	Style transfer works, metric undercounts Devanāgarī

The low G score for this model (17) is partly an artifact: the information gain metric undercounts inline Devanāgarī terms (e.g., "Kubjikā (कुब्जिका)"), a distinctive feature of Dyczkowski's style that this model reproduces.
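The card does not define how information gain or the expansion ratio is computed, so the following is only a rough sketch under stated assumptions. The expansion ratio is taken as a ratio of whitespace-token counts, and an inline annotation is detected as a romanized word followed by a parenthesized run of characters from the Unicode Devanāgarī block (U+0900 to U+097F). The function names are hypothetical. A check like this could report how many annotated terms an output contains, independently of how the G metric treats them.

```python
import re

# Assumption: an inline annotation looks like "word (<Devanagari text>)".
# U+0900 to U+097F is the Unicode Devanagari block.
INLINE_DEVANAGARI = re.compile(
    r"(\S+)\s*\(([\u0900-\u097F]+(?:\s+[\u0900-\u097F]+)*)\)"
)


def inline_devanagari_terms(text: str) -> list[tuple[str, str]]:
    """Return (romanized, devanagari) pairs for each inline annotation."""
    return INLINE_DEVANAGARI.findall(text)


def expansion_ratio(source: str, commentary: str) -> float:
    """Whitespace-token length of the commentary divided by that of the source.

    Assumption: the card's own ratio may be computed differently
    (for example over characters or model tokens).
    """
    source_len = len(source.split())
    if source_len == 0:
        raise ValueError("source text is empty")
    return len(commentary.split()) / source_len


sample = "Kubjikā (कुब्जिका) is worshipped with the mantra (मन्त्र) of the tradition."
terms = inline_devanagari_terms(sample)
print(len(terms), "inline Devanāgarī annotations found")
print(expansion_ratio("śaktipāta", sample))
```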

Training Details

Parameter	Value
Base model	Qwen/Qwen2.5-14B-Instruct
Method	QLoRA (4-bit NF4)
Rank / Alpha	16 / 32
Trainable params	40M / 7.6B total (0.53%)
Training data	704 pairs (28 OCR + 676 lecture)
Validation	88 pairs
Test	89 pairs
Epochs	3 (264 steps)
Training time
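
The figures in the table can be cross-checked with a little arithmetic. The sketch below uses only numbers reported above. The implied examples-per-step value is an inference: it assumes every epoch iterates over all 704 training pairs with no dropped batches, and the card does not state the per-device batch size or gradient accumulation setting.

```python
# Values as reported in the Training Details table.
train_pairs, val_pairs, test_pairs = 704, 88, 89
epochs, total_steps = 3, 264
trainable_params, total_params = 40e6, 7.6e9

# Optimizer steps per epoch, and the number of pairs each step must cover
# (assumption: no dropped or partial batches).
steps_per_epoch = total_steps // epochs
examples_per_step = train_pairs / steps_per_epoch

# Share of parameters updated by the adapter.
trainable_pct = 100 * trainable_params / total_params

# Size of each split relative to the whole dataset.
total_pairs = train_pairs + val_pairs + test_pairs
train_share = 100 * train_pairs / total_pairs

print(f"steps per epoch:   {steps_per_epoch}")
print(f"examples per step: {examples_per_step:.1f}")
print(f"trainable share:   {trainable_pct:.2f}%")
print(f"train split:       {train_share:.1f}% of {total_pairs} pairs")
```

Under these assumptions each optimizer step covers 8 training pairs. That could correspond to a per-device batch of 2 with 4 accumulation steps, though other combinations are equally possible.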

Training Data Sources

OCR-extracted pairs (28) — Verse-commentary alignments from Tantrāloka Volume 1, pages 15-94. Extracted via Chandra OCR-2 (5.3B) on SageMaker.
Lecture term-explanation pairs (676) — Sanskrit terms with contextual explanations from 24 Kubjikā/Paścimāmnāya lectures. Transcribed via Whisper large-v3, cleaned, and corrected with Claude Haiku 4.5.

Training data derived from Anamavajra-Labs/tantraloka-dyczkowski-raw.

Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load the base model in 4-bit NF4, matching the QLoRA setup listed under Training Details.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-14B-Instruct",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)
# Attach the LoRA adapter on top of the quantized base model.
model = PeftModel.from_pretrained(base, "Anamavajra-Labs/exegen-qwen14b-lora")
tokenizer = AutoTokenizer.from_pretrained("Anamavajra-Labs/exegen-qwen14b-lora")

# ChatML-formatted prompt: system instruction, user request, open assistant turn.
prompt = """<|im_start|>system
You are a scholar of Kashmir Śaivism and Sanskrit philosophy specializing in the Tantrāloka of Abhinavagupta. Generate detailed exegetical commentary that explains the philosophical significance, technical terminology, and doctrinal context.<|im_end|>
<|im_start|>user
Generate exegetical commentary for: śaktipāta (शक्तिपात)<|im_end|>
<|im_start|>assistant
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    do_sample=True,
    repetition_penalty=1.15,  # reduces the repetition loops noted under Known Limitations
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
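
The prompt above follows the ChatML layout with `<|im_start|>` and `<|im_end|>` markers. To avoid hand-editing the string for each term, a small helper can assemble it. `build_prompt` and `SYSTEM_PROMPT` are hypothetical names introduced here for illustration; the system text is copied from the example above.

```python
# System instruction copied from the usage example.
SYSTEM_PROMPT = (
    "You are a scholar of Kashmir Śaivism and Sanskrit philosophy specializing "
    "in the Tantrāloka of Abhinavagupta. Generate detailed exegetical commentary "
    "that explains the philosophical significance, technical terminology, and "
    "doctrinal context."
)


def build_prompt(term: str, system: str = SYSTEM_PROMPT) -> str:
    """Assemble a ChatML prompt that ends with an open assistant turn."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\nGenerate exegetical commentary for: {term}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


prompt = build_prompt("śaktipāta (शक्तिपात)")
print(prompt)
```

The resulting string can be passed to the tokenizer in place of the literal prompt. If the tokenizer ships a chat template, `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` is an alternative worth comparing against this layout.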

Hardware requirements: ~12 GB VRAM (4-bit quantized). Runs on a single RTX 3090/4090 or T4.

Example Output

Input: kuṇḍalinī (कुण्डलिनी)

Output (excerpt):

Kuṇḍalinī (कुण्डलिनी) is the coiled energy that resides at the base of the central channel (suṣumnā). In the Kubjikā (कुब्जिका) tradition of the Paścimāmnāya (पश्चिमाम्नाय), she is identified with the goddess herself — the crooked one, bent with the weight of manifestation. Abhinavagupta in the Tantrāloka (तन्त्रालोक) describes her awakening through śaktipāta (शक्तिपात), the descent of grace, which occurs through the guru's transmission...

Note the characteristic Dyczkowski-style inline Devanāgarī annotations and cross-references to Kubjikā and Paścimāmnāya traditions.

Qualitative Observations

The model successfully learns:

Inline Devanāgarī — "mantra (मन्त्र)", "Kubjikā (कुब्जिका)" style annotations
Tradition-specific framing — References to Kubjikā, Kula, Paścimāmnāya, Krama
Commentarial voice — Adopts Dyczkowski's oral teaching register
Cross-referencing — Spontaneous references to related texts and practices

Known Limitations

Small training set (704 pairs) — repetition loops on some inputs (use repetition_penalty=1.15)
Domain narrow — Primarily Kashmir Śaivism / Kubjikā tradition; limited coverage of other darśanas
Hallucinated terms — Occasionally generates plausible but incorrect Sanskrit compounds
No retrieval — Pure generation without grounding; B3.1 (RAG) scores higher on factual accuracy
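
Repetition loops can be flagged automatically before an output is shown or stored. The following is a minimal sketch, not part of the released model: `repeated_ngram_fraction` and `is_looping` are hypothetical helpers that measure what share of whitespace-delimited 4-grams occur more than once, and the 0.3 threshold is an arbitrary assumption that would need tuning on real outputs.

```python
from collections import Counter


def repeated_ngram_fraction(text: str, n: int = 4) -> float:
    """Fraction of n-gram occurrences whose n-gram appears more than once."""
    tokens = text.split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0  # text shorter than n tokens cannot loop at this scale
    counts = Counter(ngrams)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(ngrams)


def is_looping(text: str, threshold: float = 0.3) -> bool:
    """Assumed heuristic: treat the output as looping above the threshold."""
    return repeated_ngram_fraction(text) > threshold


looping = "the goddess is the energy " * 8
clean = "Kuṇḍalinī is the coiled energy that resides at the base of the central channel"
print(is_looping(looping), is_looping(clean))
```

Outputs flagged this way could be regenerated with a different seed or a higher repetition penalty.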

Next Steps

119 additional Tantrāloka lectures (chapters 1-3) have been transcribed and are available in the dataset. Extracting verse-commentary pairs from them could expand the training set from 704 to 2,000 or more pairs for the next fine-tuning iteration.

Citation

```bibtex
@misc{exegen-qwen14b-lora,
  title={ExeGen: Exegetical Generation as a New NLP Task},
  author={Ovcharov, Vladimir and Tatarchenko, Igor},
  year={2026},
  publisher={Anamavajra Labs},
  url={https://huggingface.co/Anamavajra-Labs/exegen-qwen14b-lora}
}
```

Organization

Anamavajra Labs — Sanskrit NLP & Contemplative Studies
