JoaoZaokk

Qwen3-4B-Thinking-2507-Heretic-CodeFeedback-Agentic-Tessa-1K-LoRA

README

License: other

Status

This is a candidate / experimental adapter, not a claimed major improvement.

I'll be testing some datasets to make the model better for coding, it a tiny improvement, not a game changer, but compared to the previous one this model didn't get worse.

In a small local Python coding benchmark, this adapter preserved the previous score:

Table with columns: Model, Adapter, Passed, Pass rate, Avg tokens/s
Model	Adapter	Passed	Pass rate	Avg tokens/s
Before	`heretic_F_lora_python5000_codefeedback5000`	9/10	90.00%	7.80
After	`heretic_F_lora_tessa_agentic_1000_test`	9/10	90.00%	7.86

Delta:

Table with columns: Metric, Value
Metric	Value
Passes	0
Pass rate	0.00%
Avg tokens/s	+0.05

Unlike the OpenCodeInstruct continuation experiment, this Tessa-based adapter did not regress on the small strict-code benchmark.

Training configuration

Table with columns: Item, Value
Item	Value
Base model	`JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback`
Input adapter	`heretic_F_lora_python5000_codefeedback5000`
Dataset	`smirki/Agentic-Coding-Tessa`
Samples used	1,000
Sequence length	1024
Epochs	1
Learning rate	1e-6

Benchmark files

Benchmark artifacts are included under:

text
benchmark/

Files:

text
benchmark/before_summary.md
benchmark/after_summary.md
benchmark/COMPARISON.md
benchmark/before_results.jsonl
benchmark/after_results.jsonl

Intended use

This adapter is intended for testing:

agentic coding behavior
coding assistance
code generation
code explanation
tool-use style coding responses
continued fine-tuning experiments

It should be compared against the main CodeFeedback model before use in any serious coding workflow.

Loading example

python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

base_model = "JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback"
adapter = "JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback-Agentic-Tessa-1K-LoRA"

tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

model = PeftModel.from_pretrained(model, adapter)
model.eval()

Important notes

This is an experimental LoRA adapter.

The benchmark used here is small and should not be treated as a formal coding leaderboard. It is mainly useful for local before/after regression testing.

This adapter preserved the current local benchmark score, but further testing is needed before treating it as a better general-purpose coding model.

Available on FriendliAI

Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more

Container

Run this model inference with full control and performance in your environment.

Learn more

Model Details

Model Provider

JoaoZaokk

Model Tree

Base

JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback

Adapter

this model

Input Modalities

Text

Output Modalities