jason-cse151b-sft-lora API & Inference Endpoint

Hyperparameters

LoRA r = 64, alpha = 128, dropout = 0.05
target_modules = [q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj]
5 epochs, LR 2e-4 cosine, warmup 5%
max_seq = 16384, BF16, gradient checkpointing
Effective batch size 8 (bsz=1 × grad_accum=8)
Training data: 737 SFT pairs (self-distill from K=32 SC + private hand-verified)

val_225 accuracy

After merging into base: 64.44 % (vs the 60 % QLoRA baseline).

Usage

python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Thinking-2507", dtype=torch.bfloat16, device_map="auto",
    trust_remote_code=True,
)
tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B-Thinking-2507", trust_remote_code=True)
model = PeftModel.from_pretrained(base, "JaasonYuu/jason-cse151b-sft-lora")

Hyperparameters

LoRA r = 64, alpha = 128, dropout = 0.05
target_modules = [q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj]
5 epochs, LR 2e-4 cosine, warmup 5%
max_seq = 16384, BF16, gradient checkpointing
Effective batch size 8 (bsz=1 × grad_accum=8)
Training data: 737 SFT pairs (self-distill from K=32 SC + private hand-verified)

val_225 accuracy

After merging into base: 64.44 % (vs the 60 % QLoRA baseline).

Usage

python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Thinking-2507", dtype=torch.bfloat16, device_map="auto",
    trust_remote_code=True,
)
tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B-Thinking-2507", trust_remote_code=True)
model = PeftModel.from_pretrained(base, "JaasonYuu/jason-cse151b-sft-lora")

jason-cse151b-sft-lora

README

Hyperparameters

val_225 accuracy

Usage

See also

Explore FriendliAI today

README

Hyperparameters

val_225 accuracy

Usage

See also