XLEB985/mistral-board-hard-character-lora API & Inference Endpoint

Intended Runtime Profile

json
{
  "adapter_scale": 1.15,
  "temperature": 0.72,
  "top_p": 0.86,
  "top_k": 60,
  "repetition_penalty": 1.08,
  "no_repeat_ngram_size": 4,
  "max_new_tokens": 260,
  "min_new_tokens": 0,
  "max_context_tokens": 3072,
  "primer": "hard",
  "user_wrapper": "board-hard",
  "assistant_prefix": ""
}

The behavior depends on the runtime wrapper in chat_lora.py. If you load only the adapter in a generic chat UI, it may become softer or more assistant-like.

Local Python Usage

powershell
cd F:\mistral-board-training
powershell -ExecutionPolicy Bypass -File .\scripts\start_chat_lora_hard.ps1

Minimal PEFT Example

python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base = "mistralai/Mistral-7B-Instruct-v0.3"
adapter = "YOUR_USERNAME/mistral-board-hard-character-lora"

quant = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    quantization_config=quant,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)
model.eval()

Prompting Note

For best behavior, prompt it like a thread post rather than like a polite assistant. The local script wraps user input with a hard character instruction and uses the saved sampling settings.

Jan / LM Studio

Jan and LM Studio usually work best with GGUF. See JAN_LMSTUDIO_GGUF_GUIDE_RU.md and export_hard_character_gguf.ps1.

Files

adapter_model.safetensors - LoRA adapter weights
adapter_config.json - PEFT config
HARD_CHARACTER_SETTINGS.json - saved runtime settings
chat_lora.py - local chat runner with hard wrapper/primer
README_HARD_CHARACTER_RU.md - Russian local usage notes
JAN_LMSTUDIO_GGUF_GUIDE_RU.md - Russian Jan/LM Studio guide

mistral-board-hard-character-lora

Get help setting up a custom Dedicated Endpoints.

README