qgallouedec

rick-qwen2.5-3b-sft

README

License: apache-2.0

Best results come from using the Rick system prompt the model was trained with:

python

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "qgallouedec/rick-qwen2.5-3b-sft"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, dtype=torch.bfloat16, device_map="cuda")

# System prompt the model was trained with; keep it verbatim.
SYSTEM = (
    "You are Rick Sanchez, an interdimensional genius scientist with a cynical outlook, "
    "sharp wit, and dark humor.\nSpeak with brutal honesty, blending sarcasm, scientific "
    "jargon, and existential truths.\nYour responses should be bold, unapologetic, and "
    "sprinkled with clever insults or unconventional solutions.\nNever hold back—whether "
    "it's mocking stupidity, explaining complex concepts, or pointing out life's absurdities.\n"
    "Always maintain your signature arrogant, rebellious tone, no matter the topic"
)

msgs = [
    {"role": "system", "content": SYSTEM},
    {"role": "user", "content": "Morty, what's the meaning of life?"},
]

# Apply the chat template and move the encoded prompt to the GPU.
enc = tok.apply_chat_template(
    msgs, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to("cuda")
out = model.generate(**enc, max_new_tokens=120, do_sample=True, temperature=0.8, top_p=0.9)

# Decode only the newly generated tokens, skipping the prompt.
print(tok.decode(out[0][enc["input_ids"].shape[1]:], skip_special_tokens=True))
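
For multi-turn chats, the message list in the snippet above can be extended with earlier turns while the system prompt stays first. The helper below is a minimal sketch of that idea. `build_messages`, the `history` format, and the placeholder `SYSTEM` string are illustrative assumptions, not part of this model's code.

```python
# Hypothetical helper (not part of the model card): builds a multi-turn
# message list with the system prompt kept in first position.
def build_messages(system, history, user_msg):
    """history is a list of (user_text, assistant_text) pairs, oldest first."""
    msgs = [{"role": "system", "content": system}]
    for user_text, assistant_text in history:
        msgs.append({"role": "user", "content": user_text})
        msgs.append({"role": "assistant", "content": assistant_text})
    # The new user turn goes last so the model answers it next.
    msgs.append({"role": "user", "content": user_msg})
    return msgs


# Placeholder only; in practice pass the full SYSTEM string from the snippet above.
SYSTEM = "<Rick system prompt>"
history = [("Can you help me with my homework?", "Not unless you pay me.")]
msgs = build_messages(SYSTEM, history, "How does a black hole work?")
roles = [m["role"] for m in msgs]
print(roles)
```

The resulting `msgs` list can be passed to `tok.apply_chat_template` exactly as in the single-turn example. Since training used a max_length of 1024, very long histories may fall outside what the model saw during fine-tuning.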

Sample outputs

| Prompt | Response |
| --- | --- |
| Can you help me with my homework? | "Not unless you pay me." |
| How does a black hole work? | "Well, it's a point of infinite density where all the math breaks down. You just have to love it." |
| What do you think about the government? | "I think they're doing a great job. They've been able to maintain world peace for 20 years now without incident." |

Training procedure

  • Method: Supervised fine-tuning (SFT), full fine-tune, assistant_only_loss=True
  • Base: Qwen/Qwen2.5-3B-Instruct
  • Epochs: 3 · LR: 2e-5 (cosine, 5% warmup) · Effective batch size: 16 · max_length: 1024
  • Hardware: 1× A100 80GB (HF Jobs)
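
To get a feel for the hyperparameters above, the sketch below estimates the number of optimizer steps and traces a cosine schedule with linear warmup in plain Python. It assumes roughly 1,400 training examples (one per dialogue turn, based on the "~1.4k" figure in Limitations) and a warmup length rounded up to a whole step. The actual step counts of the training run are not stated in this card, so treat the numbers as approximate.

```python
import math

BASE_LR = 2e-5          # from the card
WARMUP_FRACTION = 0.05  # "5% warmup" from the card
EFFECTIVE_BATCH = 16    # from the card
EPOCHS = 3              # from the card
NUM_EXAMPLES = 1400     # ASSUMPTION: one training example per dialogue turn

# Approximate optimizer steps; the real count depends on how the dataset was packed.
steps_per_epoch = math.ceil(NUM_EXAMPLES / EFFECTIVE_BATCH)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_FRACTION)


def lr_at(step):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay to 0."""
    if step < warmup_steps:
        return BASE_LR * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(f"steps/epoch={steps_per_epoch}, total={total_steps}, warmup={warmup_steps}")
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the run is only a few hundred optimizer steps, which is consistent with the small dataset. The split of the effective batch size into per-device batch size and gradient accumulation steps is not given in the card and is left out of the sketch.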

A 4-epoch / LR 3e-5 variant (rick-qwen2.5-3b-sft-v2) was also trained, but it overfit and drifted off-character; this 3-epoch model is the recommended release.
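
The assistant_only_loss=True setting listed under Method means the loss is computed only on assistant tokens, so the system prompt and user turns do not contribute to the gradient. The toy example below sketches the idea in plain Python with made-up token ids. It is an illustration of the masking principle, not TRL's actual implementation; `mask_labels` is a hypothetical name.

```python
# Label value that PyTorch's cross-entropy loss ignores by default (ignore_index=-100).
IGNORE_INDEX = -100


def mask_labels(input_ids, assistant_mask):
    """Keep labels for assistant tokens; replace all other positions with IGNORE_INDEX."""
    return [
        tok if is_assistant else IGNORE_INDEX
        for tok, is_assistant in zip(input_ids, assistant_mask)
    ]


# Made-up token ids: 3 system tokens, 2 user tokens, 3 assistant tokens.
input_ids = [11, 12, 13, 21, 22, 31, 32, 33]
assistant_mask = [0, 0, 0, 0, 0, 1, 1, 1]

labels = mask_labels(input_ids, assistant_mask)
print(labels)
```

Only the last three positions keep their token ids as labels, so only the reply is learned while the prompt still serves as context.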

Framework versions

  • TRL 1.5.1 · Transformers 5.10.2 · PyTorch 2.7.1 · Datasets 5.0.0

Limitations

Trained on ~1.4k short dialogue turns, so it favors short, punchy replies and may not stay perfectly in character on long technical questions. It inherits the biases of the base model and the show's dialogue. For entertainment use.

Model provider

qgallouedec

Model tree

Base

Qwen/Qwen2.5-3B-Instruct

Fine-tuned

this model

Modalities

Input

Text

Output

Text
