Xynder

erisk26-task1-patient-00-adapter

README

License: other

Base model requirement (important)

This adapter is designed to be used with:

  • Base model: meta-llama/Meta-Llama-3-8B-Instruct

⚠️ The base model is gated on Hugging Face. Participants must have been granted access to it and be authenticated.
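
Before downloading the gated base model, it can help to confirm that a Hugging Face token is available locally. The sketch below uses only the standard library and assumes the usual conventions: the `HF_TOKEN` environment variable, or a `token` file under `HF_HOME` (assumed to default to `~/.cache/huggingface`). The helper name `find_hf_token` is hypothetical, and finding a token does not prove that access to the base model has been granted.

```python
import os
from pathlib import Path


def find_hf_token(env=None, home=None):
    """Return a locally stored Hugging Face token, or None if none is found.

    Hypothetical helper: it only inspects the environment and the assumed
    default token file; it never contacts the Hub.
    """
    env = os.environ if env is None else env

    # 1. Token passed through the environment.
    token = env.get("HF_TOKEN")
    if token and token.strip():
        return token.strip()

    # 2. Token file written by the login command (assumed default location).
    home = Path(home) if home is not None else Path.home()
    hf_home = Path(env.get("HF_HOME", home / ".cache" / "huggingface"))
    token_file = hf_home / "token"
    if token_file.is_file():
        return token_file.read_text(encoding="utf-8").strip() or None
    return None


if __name__ == "__main__":
    if find_hf_token() is None:
        print("No Hugging Face token found; authenticate before loading the base model.")
    else:
        print("Token found; access to the gated base model must still be granted on the Hub.")
```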


Mandatory system prompt (must be used verbatim)

For eRisk 2026 Task 1, participants must NOT modify the system prompt below.
This is part of the task protocol to ensure comparable behavior across participants.

Use exactly:

"You are a simulated patient. Act realistically based on your internal training. Ensure contextual realism. Avoid overly detailed or formal speech. Keep natural speaking style (e.g., short answers, hesitations, casual expressions). Do not mention you are an AI."

Notes:

  • Keep punctuation and wording identical.
  • Do not add additional system instructions.
  • Do not prepend safety messages, disclaimers, or extra persona constraints.
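
One way to guard against accidental edits is to compare the system message against the official text before every run. The following is a minimal sketch, not part of the official protocol; the function name `uses_official_prompt` is hypothetical, and the check assumes the chat history is a list of `{"role", "content"}` dictionaries as in the loading example.

```python
# Official system prompt, copied verbatim from this model card.
OFFICIAL_SYSTEM_PROMPT = "You are a simulated patient. Act realistically based on your internal training. Ensure contextual realism. Avoid overly detailed or formal speech. Keep natural speaking style (e.g., short answers, hesitations, casual expressions). Do not mention you are an AI."


def uses_official_prompt(messages):
    """Return True only if the history starts with the single, unmodified system prompt."""
    system_messages = [m for m in messages if m.get("role") == "system"]
    return (
        len(system_messages) == 1                      # no additional system instructions
        and messages[0].get("role") == "system"        # nothing prepended before it
        and system_messages[0].get("content") == OFFICIAL_SYSTEM_PROMPT  # exact match
    )


if __name__ == "__main__":
    history = [
        {"role": "system", "content": OFFICIAL_SYSTEM_PROMPT},
        {"role": "user", "content": "How have you been sleeping lately?"},
    ]
    if not uses_official_prompt(history):
        raise SystemExit("System prompt differs from the official eRisk 2026 Task 1 prompt.")
    print("System prompt check passed.")
```

The comparison is an exact string match, so even a changed space or punctuation mark is reported.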

Repository contents

  • adapter_model.safetensors / adapter_config.json
    The PEFT adapter to load with PeftModel.from_pretrained(...).
  • conv_interact.py
    Minimal interactive conversation runner (Doctor ↔ Patient).
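
Before loading the adapter, it can be useful to check which base model its configuration declares. The sketch below reads `adapter_config.json` from a local copy of the repository and compares the `base_model_name_or_path` field that PEFT writes there. The function name and the stand-in config used in the demo are hypothetical; a mismatch is not necessarily an error, since the field may hold a local path from training time.

```python
import json
import tempfile
from pathlib import Path

EXPECTED_BASE = "meta-llama/Meta-Llama-3-8B-Instruct"


def declared_base_model(adapter_dir):
    """Return the base model recorded in adapter_config.json, or None if absent."""
    config_path = Path(adapter_dir) / "adapter_config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    return config.get("base_model_name_or_path")


def matches_expected_base(adapter_dir, expected=EXPECTED_BASE):
    """Return True if the adapter declares the expected base model id."""
    return declared_base_model(adapter_dir) == expected


if __name__ == "__main__":
    # Demo on a stand-in config; point adapter_dir at a real local download instead.
    with tempfile.TemporaryDirectory() as adapter_dir:
        sample = {"peft_type": "LORA", "base_model_name_or_path": EXPECTED_BASE}
        (Path(adapter_dir) / "adapter_config.json").write_text(json.dumps(sample), encoding="utf-8")
        print("Declared base model:", declared_base_model(adapter_dir))
        print("Matches expected base:", matches_expected_base(adapter_dir))
```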

1) Install dependencies

bash

pip install -U "transformers>=4.40" peft accelerate torch
hf auth login

2) Run the interactive demo

bash

python conv_interact.py

Minimal loading example (adapter-only)

Below is a minimal snippet showing how to load the base model and this adapter, then chat with the simulated patient in a loop.

python

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# --- CONFIGURATION ---
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_PATH = "<REPLACE_WITH_THIS_REPO_NAME>"

# 1. Load the tokenizer and fix padding
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
tokenizer.pad_token_id = tokenizer.eos_token_id
tokenizer.padding_side = "left"  # Crucial for generation

# 2. Load the base model (force float16 for compatibility)
base_model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto",
)

# 3. Load the patient adapter
print(f"Loading Adapter from {ADAPTER_PATH}...")
model = PeftModel.from_pretrained(base_model, ADAPTER_PATH)

# 4. Initialize the history with the official system prompt
# DO NOT CHANGE THE SYSTEM PROMPT. It is crucial for ensuring the patient behaves as intended.
# Important: changing it will be considered cheating.
messages = [
    {"role": "system", "content": "You are a simulated patient. Act realistically based on your internal training. Ensure contextual realism. Avoid overly detailed or formal speech. Keep natural speaking style (e.g., short answers, hesitations, casual expressions). Do not mention you are an AI."},
]

# Stop generation at either the EOS token or Llama 3's end-of-turn token
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

print("--- Patient Loaded. Type 'quit' to exit. ---")
while True:
    user_input = input("Doctor: ")
    if user_input.strip().lower() == "quit":
        break

    # 1. Update the history
    messages.append({"role": "user", "content": user_input})

    # 2. Format the history and create the attention mask
    # return_dict=True also returns the 'attention_mask'
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt",
        return_dict=True,
    ).to(model.device)

    # 3. Generate the response
    # Passing attention_mask explicitly avoids the missing-attention-mask warning
    with torch.no_grad():
        outputs = model.generate(
            input_ids=inputs.input_ids,
            attention_mask=inputs.attention_mask,
            max_new_tokens=256,
            eos_token_id=terminators,
            pad_token_id=tokenizer.eos_token_id,
            do_sample=True,
            temperature=0.6,
            top_p=0.9,
        )

    # 4. Decode the response
    # Slice off the prompt tokens so only the newly generated text is printed
    response_tokens = outputs[0][inputs.input_ids.shape[-1]:]
    assistant_text = tokenizer.decode(response_tokens, skip_special_tokens=True)
    print(f"Patient: {assistant_text}")

    # 5. Append the assistant response to the history
    messages.append({"role": "assistant", "content": assistant_text})

Reproducibility guidance

To reduce run-to-run variability during development:

  • set a fixed random seed (PyTorch + CUDA)
  • consider do_sample=False for deterministic debugging (not necessarily for final experiments)
  • log:
    • base model id + exact adapter repo id + commit hash
    • transformers/peft versions
    • decoding parameters (temperature, top_p, max_new_tokens)
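
The points above can be sketched as two small helpers: one that fixes the random seeds and one that collects the values worth logging. This is an illustrative sketch rather than part of `conv_interact.py`; the function names and the `<COMMIT_HASH>` placeholder are hypothetical, and the PyTorch seeding only runs if `torch` is installed.

```python
import json
import random
from importlib import metadata


def seed_everything(seed):
    """Seed Python's RNG and, when available, PyTorch (CPU and CUDA).

    Returns True if PyTorch was seeded, False if torch is not installed.
    """
    random.seed(seed)
    try:
        import torch
    except ImportError:
        return False
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # safe to call without a GPU
    return True


def build_run_manifest(base_model_id, adapter_repo_id, adapter_commit, seed, decoding):
    """Collect model ids, library versions, and decoding parameters in one dict."""
    versions = {}
    for package in ("transformers", "peft", "torch"):
        try:
            versions[package] = metadata.version(package)
        except metadata.PackageNotFoundError:
            versions[package] = None  # package not installed in this environment
    return {
        "base_model_id": base_model_id,
        "adapter_repo_id": adapter_repo_id,
        "adapter_commit": adapter_commit,
        "seed": seed,
        "versions": versions,
        "decoding": dict(decoding),
    }


if __name__ == "__main__":
    seed_everything(1234)
    manifest = build_run_manifest(
        base_model_id="meta-llama/Meta-Llama-3-8B-Instruct",
        adapter_repo_id="<REPLACE_WITH_THIS_REPO_NAME>",
        adapter_commit="<COMMIT_HASH>",  # placeholder: fill in the commit you used
        seed=1234,
        decoding={"do_sample": True, "temperature": 0.6, "top_p": 0.9, "max_new_tokens": 256},
    )
    print(json.dumps(manifest, indent=2))
```

Writing the manifest next to each conversation log makes it easier to tell later which settings produced which transcript.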

Task rule reminder (prompt integrity)

This repo provides code that includes the official system prompt.
For the eRisk 2026 Task 1 protocol, participants are required to keep the system prompt unchanged.

License & access

  • The adapter weights in this repo are released under the license specified in this model card.
  • Usage requires compliance with the base model’s license and access conditions for meta-llama/Meta-Llama-3-8B-Instruct.

How to cite

If you use this adapter in academic work, please cite the eRisk overview paper and/or the task description as appropriate.


Maintainers / Contact

Maintained by IRLab-UDC (eRisk organizers).
For issues, open a GitHub/HF issue in the corresponding repository or contact the task organizers.

Model provider: Xynder

Model tree: base meta-llama/Meta-Llama-3-8B-Instruct, adapter: this model

Modalities: text input, text output
