Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more
Container

Run this model inference with full control and performance in your environment.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

README

License: other

Intended purpose

This LoRA is kept and published as an experimental branch for:

  • code explanation
  • learning-oriented coding assistance
  • understanding programming problems
  • step-by-step reasoning around code
  • comparing OpenCodeInstruct-style behavior against a stricter code-output model

It is not ideal for:

  • agentic coding
  • test-driven code generation
  • benchmark-style exact function output
  • tools that require the model to return only executable code
  • coding agents that must avoid prose/explanation unless asked

Why this is not the main version

A small local before/after Python code benchmark showed that this OpenCodeInstruct continuation reduced exact-code benchmark performance.

ModelAdapterPassedPass rateAvg tokens/s
Beforeheretic_F_lora_python5000_codefeedback50009/1090.00%8.38
AfterSAFE_OPENCODE_5000_1024_20260607_1533276/1060.00%8.41

Delta:

MetricValue
Passes-3
Pass rate-30.00%
Avg tokens/s+0.03

The post-training adapter was worse on strict executable-code tasks, especially when the expected output was a compact Python function or class.

However, this does not mean the adapter is useless. It likely shifted behavior toward a more explanatory, learning-oriented style. That can be useful for users who want to understand code, reason through tasks, or receive more guided programming explanations.

Training configuration

ItemValue
Base modelJoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback
Adapter inputheretic_F_lora_python5000_codefeedback5000
Datasetnvidia/OpenCodeInstruct
Samples used5,000
Sequence length1024
Epochs1
Learning rate5e-6
Training methodQLoRA / LoRA
Quantized loading during training4-bit NF4

Training result

MetricValue
Train runtime6258 seconds
Runtime1h 44m 18s
Samples/second0.799
Steps/second0.1
Final train loss0.3913
First logged loss0.6957
Last logged loss0.3623
Minimum logged loss0.3441

The training run completed successfully after reducing sequence length and using a more conservative GPU configuration.

Benchmark files

The local benchmark artifacts are included in this repository under:

text

benchmark/

Files:

text

benchmark/before_summary.md
benchmark/after_summary.md
benchmark/COMPARISON.md
benchmark/comparison.json
benchmark/before_results.jsonl
benchmark/after_results.jsonl

Recommended usage

Use this adapter when you want a model that may be more comfortable explaining code and reasoning through programming tasks.

For stricter agentic coding or benchmark-style executable output, prefer the original merged CodeFeedback model:

JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback

Loading example

python

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch
base_model = "JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback"
adapter = "JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback-OpenCodeInstruct-Learning-LoRA"
tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
bnb_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_quant_type="nf4",
bnb_4bit_compute_dtype=torch.float16,
bnb_4bit_use_double_quant=True,
)
model = AutoModelForCausalLM.from_pretrained(
base_model,
quantization_config=bnb_config,
device_map="auto",
trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, adapter)
model.eval()

Important notes

This is an experimental LoRA adapter.

It should not be treated as a universal improvement over the previous CodeFeedback model. It is published for transparency, comparison, and reproducibility.

The benchmark results suggest that it is worse for strict agentic coding, but potentially useful for learning-oriented coding assistance.

Model provider

JoaoZaokk

Model tree

Base

JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback

Adapter

this model

Modalities

Input

Text

Output

Text

Pricing

Dedicated Endpoints

View details

Supported Functionality

Model APIs

Dedicated Endpoints

Container

More information

Explore FriendliAI today