Hyperparameters
{
"adapter_repo": "herooooooooo/first-hf-run-pi-mono-gemma4-e2b-adapter-lr2e-4-r16-alpha32",
"base_model": "google/gemma-4-E2B-it",
"gradient_accumulation_steps": 16,
"learning_rate": 0.0002,
"lora_alpha": 32,
"lora_dropout": 0.05,
"lora_r": 16,
"max_length": 768,
"max_steps": 400,
"per_device_train_batch_size": 1,
"run_name": "lr2e-4-r16-alpha32"
}
Dataset Conversion
{
"conversion_notes": [
"Recovery eval reused the same session-level held-out split as training.",
"Assistant thinking blocks and thinking signatures are stripped by the converter.",
"Prompt context is left-truncated during manual loss scoring so assistant tokens remain scored."
],
"dataset_id": "badlogicgames/pi-mono",
"eval_examples": 800,
"eval_fraction_by_session": 0.1,
"eval_session_files": 31,
"max_eval_examples": 800,
"seed": 42,
"total_session_files_seen": 626,
"train_session_files": 563
}
The source dataset was already exported with pi-share-hf; this training job
does not read local raw traces or secrets. The converter strips assistant
thinking/signatures and omits non-text payloads for this text-only SFT run.
Recovery Eval Note
This adapter was already pushed by the original sweep job. A follow-up adapter-only recovery job recomputed held-out loss manually because the original job crashed in the Trackio callback after final evaluation.