README

License: MIT

Model Details

  • Base model: microsoft/DialoGPT-large
  • Architecture: GPT2LMHeadModel
  • Task: text generation
  • Context length: 1024 tokens
  • Parameters: 774.0M
  • Evaluation perplexity: 8.6719
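For a causal language model, perplexity is conventionally the exponential of the mean per-token cross-entropy loss (in nats). The card does not state how the figure above was computed, so the sketch below simply assumes that convention and recovers the evaluation loss it would imply:

```python
import math

# Value reported in the model details above.
eval_perplexity = 8.6719

# Assumption: perplexity = exp(mean token-level cross-entropy, in nats).
# Under that convention the evaluation loss is the natural log of the perplexity.
implied_eval_loss = math.log(eval_perplexity)

print(f"implied eval loss: {implied_eval_loss:.4f} nats/token")
```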

Training

The fine-tuning run used the following setup:

  • Framework: Hugging Face Transformers
  • Training data: data/gpt-dialogues/train.txt; evaluation data: data/gpt-dialogues/dev.txt, built from DailyDialog CSV resources
  • Epochs: 4
  • Train/eval batch size per GPU: 1 / 1
  • Gradient accumulation steps: 6
  • Effective training batch size: 6
  • Learning rate: 1e-5
  • Max gradient norm: 1.0
  • Objective: line-by-line causal language modeling
  • Seed: 42
  • Checkpointing/logging: every 5000 optimizer steps; last checkpoint kept
  • Memory optimization: gradient checkpointing enabled
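The batch-size figures above are related by simple arithmetic: effective batch size = per-GPU batch size × gradient accumulation steps × number of GPUs. The listed values (1 and 6, giving 6) are consistent with a single GPU, but the GPU count is an inference and is not stated in the card. A minimal sketch of the calculation:

```python
# Values taken from the setup list above.
per_device_train_batch_size = 1
gradient_accumulation_steps = 6
checkpoint_every_optimizer_steps = 5000

# Assumption: one GPU, inferred from effective batch size 6 = 1 * 6 * 1.
num_devices = 1

# Number of training examples that contribute to each optimizer update.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# Number of training examples processed between two checkpoints.
examples_per_checkpoint = checkpoint_every_optimizer_steps * effective_batch_size

print(effective_batch_size)     # 6
print(examples_per_checkpoint)  # 30000
```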

Training Format

Training examples use adjacent DailyDialog utterance pairs with explicit source and target emotion labels:

text

<bos><source_emotion>source utterance<sep><target_emotion>target utterance<|endoftext|>
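As an illustration of the template above, the following sketch assembles one training line from an utterance pair. The function name `format_training_example` and the sample target utterance are hypothetical; the original preprocessing script is not shown in this card and may differ in details such as whitespace handling.

```python
BOS = "<bos>"
SEP = "<sep>"
EOS = "<|endoftext|>"


def format_training_example(
    source_emotion: str,
    source_utterance: str,
    target_emotion: str,
    target_utterance: str,
) -> str:
    """Build one training line in the documented tag layout.

    Emotion names are passed without angle brackets, e.g. "fear".
    """
    return (
        f"{BOS}<{source_emotion}>{source_utterance}"
        f"{SEP}<{target_emotion}>{target_utterance}{EOS}"
    )


# Hypothetical utterance pair, for illustration only.
line = format_training_example(
    "fear",
    "I just started a new job and I am a bit nervous.",
    "happiness",
    "You will do great!",
)
print(line)
```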

Prompt Format

At generation time, the prompt should include the source utterance and the desired target emotion:

text

<bos><source_emotion>source utterance<sep><target_emotion>

Prompt and training tags:

  • <bos> marks the beginning of one formatted dialogue example.
  • <source_emotion> is a placeholder for one emotion label describing the input/source utterance, for example <fear>.
  • source utterance is the user/input text.
  • <sep> separates the source side from the response side.
  • <target_emotion> is a placeholder for the emotion you want the generated response to follow, for example <happiness>.
  • target utterance is the response text generated by the model.
  • <|endoftext|> marks the end of one example. GPT-2 uses this as its native end-of-text/eos token, and generation can stop when this token is produced.
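To make the role of each tag concrete, the sketch below splits a formatted line back into its parts with a regular expression. The pattern and the `parse_example` helper are illustrative assumptions, not part of the model's tooling; the trailing `<|endoftext|>` and the target utterance are treated as optional so that prompts parse too.

```python
import re

# Layout: <bos><source_emotion>source<sep><target_emotion>target<|endoftext|>
EXAMPLE_PATTERN = re.compile(
    r"^<bos><(?P<source_emotion>[^<>]+)>(?P<source>.*?)"
    r"<sep><(?P<target_emotion>[^<>]+)>(?P<target>.*?)"
    r"(?:<\|endoftext\|>)?$",
    re.DOTALL,
)


def parse_example(line: str) -> dict:
    """Return the emotion labels and utterances found in a formatted line."""
    match = EXAMPLE_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"line does not follow the expected format: {line!r}")
    return match.groupdict()


# Hypothetical complete example, for illustration only.
parsed = parse_example(
    "<bos><fear>I am a bit nervous.<sep><happiness>You will do great!<|endoftext|>"
)
print(parsed["source_emotion"], "->", parsed["target_emotion"])
print(parsed["target"])
```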

Emotion conditioning: replace <source_emotion> and <target_emotion> in the template with one of the model's literal emotion tokens in each position.

Supported emotion labels:

  • <no emotion>
  • <anger>
  • <disgust>
  • <fear>
  • <happiness>
  • <sadness>
  • <surprise>

For example:

text

<bos><fear>I just started a new job and I am a bit nervous.<sep><happiness>

This means: the source utterance expresses fear, and the requested response should be conditioned toward happiness.
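A small helper can build such prompts and reject labels outside the supported set before they reach the model. This is a minimal sketch; `build_prompt` and `SUPPORTED_EMOTIONS` are hypothetical names introduced here, not part of the released code.

```python
# Emotion names from the supported label list, without angle brackets.
SUPPORTED_EMOTIONS = (
    "no emotion",
    "anger",
    "disgust",
    "fear",
    "happiness",
    "sadness",
    "surprise",
)


def build_prompt(source_emotion: str, source_utterance: str, target_emotion: str) -> str:
    """Return a generation prompt in the documented format."""
    for emotion in (source_emotion, target_emotion):
        if emotion not in SUPPORTED_EMOTIONS:
            raise ValueError(f"unsupported emotion label: {emotion!r}")
    return f"<bos><{source_emotion}>{source_utterance}<sep><{target_emotion}>"


prompt = build_prompt(
    "fear",
    "I just started a new job and I am a bit nervous.",
    "happiness",
)
print(prompt)
# <bos><fear>I just started a new job and I am a bit nervous.<sep><happiness>
```

Validating up front is a design choice: an unknown tag would likely be split into ordinary sub-word tokens rather than treated as an emotion token, which could silently weaken the conditioning.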

How to Use

python

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned tokenizer and model from the Hub.
repo_id = "mario-rc/emotional-dialogpt-large"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)
model.config.pad_token_id = tokenizer.pad_token_id

# Source emotion: <fear>; requested response emotion: <happiness>.
prompt = "<bos><fear>I just started a new job and I am a bit nervous.<sep><happiness>"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a response; generation stops once the end-of-text token is produced.
outputs = model.generate(
    **inputs,
    do_sample=True,
    max_new_tokens=80,
    temperature=0.8,
    top_p=0.95,
    pad_token_id=tokenizer.pad_token_id,
    eos_token_id=tokenizer.eos_token_id,
)

# Keep only the newly generated tokens (drop the prompt tokens).
generated = outputs[0][inputs["input_ids"].shape[-1]:]
response = tokenizer.decode(generated, skip_special_tokens=False)

# Cut the text at the first end-of-text token.
response = response.split(tokenizer.eos_token, 1)[0].strip()

# Strip a leading emotion label if the model repeats one.
emotion_labels = [
    "<no emotion>",
    "<anger>",
    "<disgust>",
    "<fear>",
    "<happiness>",
    "<sadness>",
    "<surprise>",
]
for label in emotion_labels:
    if response.startswith(label):
        response = response[len(label):].strip()
        break

print(response)

Limitations

The model is intended for experimental dialogue/text generation. Generated text may be inaccurate, biased, repetitive, or emotionally inappropriate, and should be reviewed before user-facing use.

Model provider

mario-rc

Model tree

  • Base: microsoft/DialoGPT-large
  • Fine-tuned: this model

Modalities

  • Input: Text
  • Output: Text
