amkhrjee/pg-chat API & Inference Endpoint

Model Details

Base model: unsloth/Qwen3-4B-Instruct-2507-unsloth-bnb-4bit
Fine-tuning method: LoRA (PEFT)
Rank (r): 8
Alpha: 32
Dropout: 0.1
Trainer: TRL SFTTrainer
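
To make the LoRA hyperparameters above concrete, the sketch below computes the scaling factor applied to the adapter update and the number of trainable parameters the adapter adds to a single linear layer. It assumes the standard LoRA formulation, in which the low-rank update is scaled by alpha / r. The layer dimensions are hypothetical and are not taken from the base model's configuration.

```python
def lora_scaling(alpha: float, r: int) -> float:
    """Scaling applied to the low-rank update B @ A in standard LoRA."""
    return alpha / r


def lora_params_per_layer(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added to one linear layer with d_in inputs and d_out outputs.

    A has shape (r, d_in) and B has shape (d_out, r).
    """
    return r * (d_in + d_out)


# Values from the model details above.
RANK, ALPHA = 8, 32

# Hypothetical square projection layer, for illustration only.
D_IN = D_OUT = 2048

scaling = lora_scaling(ALPHA, RANK)
added = lora_params_per_layer(D_IN, D_OUT, RANK)
print(f"scaling = {scaling}")
print(f"added parameters per layer = {added}")
```

A higher alpha-to-rank ratio makes the adapter's contribution larger relative to the frozen base weights, which is one reason the two values are usually reported together.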

Dataset

Training data: pookie3000/pg_chat

The dataset consists of conversational user-assistant exchanges formatted as chat messages.
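
As an illustration of the chat-message format, the sketch below shows one conversation as a list of `role`/`content` dictionaries, the same shape `apply_chat_template` accepts, together with a small check that turns alternate between user and assistant. The dataset's actual column names are not confirmed here, and both the `validate_conversation` helper and the example exchange are hypothetical.

```python
from typing import Dict, List

Message = Dict[str, str]


def validate_conversation(messages: List[Message]) -> bool:
    """Return True if turns alternate user/assistant, starting with the user."""
    if not messages:
        return False
    expected = "user"
    for message in messages:
        content = message.get("content")
        if message.get("role") != expected:
            return False
        if not isinstance(content, str) or not content.strip():
            return False
        # Flip the expected speaker for the next turn.
        expected = "assistant" if expected == "user" else "user"
    return True


# Made-up exchange, for illustration only (not taken from the dataset).
example = [
    {"role": "user", "content": "How do I pick a startup idea?"},
    {"role": "assistant", "content": "Start with a problem you have yourself."},
]

print(validate_conversation(example))
```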

Usage

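```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "unsloth/Qwen3-4B-Instruct-2507-unsloth-bnb-4bit"
ADAPTER = "amkhrjee/pg-chat"

# The tokenizer (and its chat template) comes from the base model.
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)

# The base checkpoint is pre-quantized to 4-bit, so the bitsandbytes
# package is expected to be installed alongside transformers.
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    device_map="auto",
)

# Attach the LoRA adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(
    base_model,
    ADAPTER,
)

messages = [
    {
        "role": "user",
        "content": "What advice would you give to a young founder?"
    }
]

# Render the conversation into the prompt format the model was trained on.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True,
)

# The decoded text contains the prompt followed by the generated reply.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```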
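
For a decoder-only model, `generate` returns the prompt tokens followed by the newly generated ones, so the printed text repeats the prompt. A minimal sketch of keeping only the reply is shown below. It uses plain lists and made-up token ids so that it runs without the model, and the `strip_prompt` helper is hypothetical.

```python
from typing import List, Sequence


def strip_prompt(output_ids: Sequence[int], prompt_length: int) -> List[int]:
    """Return only the token ids generated after the prompt."""
    return list(output_ids[prompt_length:])


# Hypothetical token ids, for illustration only.
prompt_ids = [101, 7, 42]
output_ids = prompt_ids + [9, 13, 2]

new_ids = strip_prompt(output_ids, len(prompt_ids))
print(new_ids)

# With the objects from the usage example, the equivalent slice would be:
#   new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
#   print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```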

Limitations

Responses do not represent the actual views of Paul Graham.
The model may generate inaccurate or fabricated information.
The training dataset is relatively small and may not cover all topics consistently.
This model inherits the capabilities and limitations of the base Qwen3 model.

Training Configuration

| Parameter | Value |
| --- | --- |
| Base Model | `unsloth/Qwen3-4B-Instruct-2507-unsloth-bnb-4bit` |
| LoRA Rank | 64 |
| LoRA Alpha | 32 |
| LoRA Dropout | 0.1 |
| Epochs | 10 |
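
To give a feel for what 10 epochs means in practice, the sketch below makes a rough estimate of the number of optimizer steps in a run. Only the epoch count comes from the configuration above. The dataset size, batch size, and gradient accumulation values are hypothetical, and the exact count reported by a trainer can differ slightly depending on how it rounds partial batches.

```python
import math


def estimate_optimizer_steps(
    num_examples: int,
    epochs: int,
    per_device_batch_size: int,
    grad_accum_steps: int = 1,
    num_devices: int = 1,
) -> int:
    """Rough number of optimizer updates over a full training run."""
    effective_batch = per_device_batch_size * grad_accum_steps * num_devices
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return steps_per_epoch * epochs


EPOCHS = 10          # from the training configuration
NUM_EXAMPLES = 500   # hypothetical dataset size

steps = estimate_optimizer_steps(
    NUM_EXAMPLES,
    EPOCHS,
    per_device_batch_size=4,  # hypothetical
    grad_accum_steps=4,       # hypothetical
)
print(f"estimated optimizer steps: {steps}")
```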

License

This repository contains only LoRA adapter weights.

Please refer to the licenses of the base model and training dataset for applicable terms.
