ewald1976
g4-12b-it-trismegistus
License: apache-2.0
Model
- Base model: unsloth/gemma-4-12b-it
- Method: LoRA fine-tune, merged into the base weights
- Dataset: meseca/trismegistus-5k-v0.1
Intended use
This model is fine-tuned on meseca/trismegistus-5k-v0.1, a 5k subset of teknium's Trismegistus Project: a synthetic, GPT-4-generated instruction dataset covering esoterica in a broad sense, including mysticism, hermeticism, religion, meditation, magick, spirituality, alchemy, numerology, tarot, and related topics. As a result, the model leans toward esoteric, occult, and spiritual subject matter and answers such prompts in an engaged, in-domain style rather than a detached, encyclopedic one. It is best suited for creative and exploratory work in these areas (worldbuilding, thematic writing, conversational exploration of esoteric concepts).
Limitations
The training data is fully synthetic; content is not factually authoritative and should not be treated as reference material. The esoteric focus shifts the base model's tone and may reduce its neutrality on these topics. General-purpose instruction-following capability from the base model is largely retained but was not the training target here.
Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ewald1976/g4-12b-it-trismegistus"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "..."}]
# return_dict=True returns input_ids and attention_mask together
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
).to(model.device)

out = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the prompt
new_tokens = out[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```
Training parameters
| Parameter | Value |
|---|---|
| Epochs | 3 |
| Batch size | 2 |
| Learning rate | 2e-4 |
| Optimizer | AdamW 8-bit |
| Max steps | 0 (disabled; epochs control training length) |
| Context length | 4096 |
| Warmup steps | 5 |
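Because max steps is disabled, the run length follows from the epoch count, the batch size, and the dataset size. The sketch below is a rough estimate only: it assumes the dataset holds exactly 5,000 examples, that there is no gradient accumulation, and that training ran on a single device. None of these is stated in this card, and `RunConfig` and `total_optimizer_steps` are hypothetical names used for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class RunConfig:
    # Values copied from the training parameters table.
    epochs: int = 3
    batch_size: int = 2
    warmup_steps: int = 5
    # Assumptions, not stated in the model card:
    num_examples: int = 5000   # "5k" taken literally
    grad_accum_steps: int = 1  # no gradient accumulation
    num_devices: int = 1       # single GPU


def total_optimizer_steps(cfg: RunConfig) -> int:
    """Estimate optimizer steps for a run whose length is set by epochs."""
    effective_batch = cfg.batch_size * cfg.grad_accum_steps * cfg.num_devices
    steps_per_epoch = math.ceil(cfg.num_examples / effective_batch)
    return steps_per_epoch * cfg.epochs


cfg = RunConfig()
steps = total_optimizer_steps(cfg)
print(f"estimated optimizer steps: {steps}")
print(f"warmup share of the run: {cfg.warmup_steps / steps:.4%}")
```

Under these assumptions the 5 warmup steps cover only a very small fraction of the run; a different accumulation setting or device count would change the estimate proportionally.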
LoRA (pre-merge)
| Parameter | Value |
|---|---|
| Rank | 32 |
| Alpha | 32 |
| Dropout | 0.05 |
| Variant | lora |
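For readers unfamiliar with what merging an adapter means, here is a minimal pure-Python sketch of the standard LoRA merge, W' = W + (alpha / rank) · B·A. It assumes the conventional alpha / rank scaling; the exact merge routine used for this model is not stated in this card, and `matmul` and `merge_lora` are hypothetical helpers operating on toy-sized matrices.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, a, b, rank, alpha):
    """Return W + (alpha / rank) * B @ A, the merged weight matrix."""
    scale = alpha / rank
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy sizes for illustration: a 2x3 weight with a rank-1 adapter.
# With the card's values (rank 32, alpha 32) the scale is 32 / 32 = 1.0,
# which this toy example mirrors with rank=1, alpha=1.
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0]]
A = [[1.0, 2.0, 3.0]]   # shape (rank, in_features)
B = [[0.5], [-1.0]]     # shape (out_features, rank)

merged = merge_lora(W, A, B, rank=1, alpha=1)

# The merged weight gives the same output as base plus adapter applied separately.
x = [[1.0], [1.0], [1.0]]
base_plus_adapter = [
    [p[0] + q[0]] for p, q in zip(matmul(W, x), matmul(B, matmul(A, x)))
]
print(merged)
print(matmul(merged, x) == base_plus_adapter)
```

After merging, the adapter matrices are no longer needed at inference time, which is why the usage snippet loads this model like any ordinary checkpoint.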
Frameworks
- Unsloth
- TRL / SFTTrainer
Modalities
- Input: Video, Audio, Text, Image
- Output: Text