Alexdarkstalker228/justmodel API & Inference Endpoint

Training procedure

SFT Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
seed: 42
distributed_type: multi-GPU
num_devices: 8
gradient_accumulation_steps: 16
total_train_batch_size: 256
total_eval_batch_size: 8
optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 2.0

GRPO Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-06
seed: 42
distributed_type: multi-GPU
num_devices: 8
total_train_batch_size: 128
total_eval_batch_size: 8
ppo_mini_batch_size: 32
ppo_micro_batch_size_per_gpu: 20
kl_loss_coef: 0.001
lr_scheduler_warmup_steps: 10
num_epochs: 2.0

Usage

For quick start, please see MindIntLab-HFUT/Psyche-R1 on GitHub.

Citation

If this work is helpful, please kindly cite as:

bibtex
@article{dai2025psyche,
  title={Psyche-R1: Towards Reliable Psychological LLMs through Unified Empathy, Expertise, and Reasoning},
  author={Dai, Chongyuan and Hu, Jinpeng and Shi, Hongchang and Li, Zhuo and Yang, Xun and Wang, Meng},
  journal={arXiv preprint arXiv:2508.10848},
  year={2025}
}

Training procedure

SFT Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
seed: 42
distributed_type: multi-GPU
num_devices: 8
gradient_accumulation_steps: 16
total_train_batch_size: 256
total_eval_batch_size: 8
optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 2.0

GRPO Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-06
seed: 42
distributed_type: multi-GPU
num_devices: 8
total_train_batch_size: 128
total_eval_batch_size: 8
ppo_mini_batch_size: 32
ppo_micro_batch_size_per_gpu: 20
kl_loss_coef: 0.001
lr_scheduler_warmup_steps: 10
num_epochs: 2.0

Usage

For quick start, please see MindIntLab-HFUT/Psyche-R1 on GitHub.

Citation

If this work is helpful, please kindly cite as:

bibtex
@article{dai2025psyche,
  title={Psyche-R1: Towards Reliable Psychological LLMs through Unified Empathy, Expertise, and Reasoning},
  author={Dai, Chongyuan and Hu, Jinpeng and Shi, Hongchang and Li, Zhuo and Yang, Xun and Wang, Meng},
  journal={arXiv preprint arXiv:2508.10848},
  year={2025}
}

justmodel

Get help setting up a custom Dedicated Endpoints.

README

Training procedure

SFT Training hyperparameters

GRPO Training hyperparameters

Usage

Citation

Explore FriendliAI today

README

Training procedure

SFT Training hyperparameters

GRPO Training hyperparameters

Usage

Citation