README

License: other

Base model

| Item | Value |
| --- | --- |
| Base model | JoaoZaokk/Qwen3-4B-Thinking-2507-MiniMax-M2.1-Distill-heretic |
| Architecture family | Qwen3 |
| Parameter count | 4B |
| Format | Hugging Face Transformers / safetensors |
| Tensor type | F16 |
| Fine-tuning method | QLoRA / LoRA |
| Final state | Merged model |

Training datasets

| Dataset | Samples used | Notes |
| --- | --- | --- |
| iamtarun/python_code_instructions_18k_alpaca | 5,000 | Python instruction/code examples |
| m-a-p/CodeFeedback-Filtered-Instruction | 5,000 | Code instruction and feedback examples |

A SWE-smith trajectory experiment was tested separately, but it was not used in this final merged version.
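The card does not say how the 5,000 samples per dataset were chosen or how the two schemas were unified. The following is a minimal sketch, assuming a seeded random subset and assuming the column names `instruction` / `input` / `output` for the Alpaca-style dataset and `query` / `answer` for CodeFeedback. The helper names `to_pair` and `take_subset` are hypothetical.

```python
import random


def to_pair(row):
    """Map a row from either dataset to an (instruction, response) pair."""
    if "query" in row:
        # Assumed CodeFeedback schema: "query" and "answer" columns.
        return row["query"], row["answer"]
    # Assumed Alpaca-style schema: "instruction", optional "input", "output".
    instruction = row["instruction"]
    if row.get("input"):
        instruction = f"{instruction}\n\n{row['input']}"
    return instruction, row["output"]


def take_subset(rows, n=5000, seed=0):
    """Return at most n rows, sampled reproducibly (the seed is an assumption)."""
    rows = list(rows)
    if len(rows) <= n:
        return rows
    return random.Random(seed).sample(rows, n)
```

With the Hugging Face `datasets` library, the rows would typically come from `load_dataset(<dataset name>, split="train")`; any iterable of dictionaries works with the helpers above.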

LoRA configuration

| Parameter | Value |
| --- | --- |
| LoRA rank | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| Sequence length | 2048 |
| Epochs per stage | 1 |
| Quantized loading | 4-bit NF4 |
| Trainable parameters | ~33M |
| Trainable percentage | ~0.81% |

Target modules:

  • q_proj
  • k_proj
  • v_proj
  • o_proj
  • gate_proj
  • up_proj
  • down_proj
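The ~33M trainable-parameter figure can be sanity-checked by hand: a rank-r LoRA adapter on a linear layer with `in` inputs and `out` outputs adds r × (in + out) weights. The sketch below does that arithmetic for the seven target modules. The layer dimensions are assumptions taken from the commonly published Qwen3-4B configuration, not from this card, and should be checked against the base model's `config.json`.

```python
# Assumed Qwen3-4B dimensions (not stated in this card).
HIDDEN = 2560          # hidden_size
INTERMEDIATE = 9728    # intermediate_size of the MLP
NUM_LAYERS = 36        # number of decoder layers
Q_OUT = 32 * 128       # attention heads * head_dim
KV_OUT = 8 * 128       # key/value heads * head_dim

RANK = 16              # LoRA rank from the configuration table

# (in_features, out_features) of each target module in one decoder layer.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, Q_OUT),
    "k_proj": (HIDDEN, KV_OUT),
    "v_proj": (HIDDEN, KV_OUT),
    "o_proj": (Q_OUT, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_param_count(shapes, rank, num_layers):
    """Sum rank * (in + out) over all target modules, then over all layers."""
    per_layer = sum(rank * (n_in + n_out) for n_in, n_out in shapes.values())
    return per_layer * num_layers


total = lora_param_count(TARGET_SHAPES, RANK, NUM_LAYERS)
print(f"{total:,} trainable LoRA parameters")  # 33,030,144
```

Under these assumed dimensions the result is about 33.0M, which agrees with the reported figure.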

Training stages

| Stage | Input adapter | Dataset | Output adapter |
| --- | --- | --- | --- |
| 1 | Base model | Python instructions 5k | heretic_F_lora_python_5000 |
| 2 | heretic_F_lora_python_5000 | CodeFeedback 5k | heretic_F_lora_python5000_codefeedback5000 |
| Final | Base model + final adapter | Merge | Full safetensors model |
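The adapter handoff between stages could look roughly like the sketch below, written against the Transformers and PEFT APIs. It is not the author's training script: the training loop is omitted, the function names are hypothetical, the float16 compute dtype is an assumption, and adapter names are treated as local directory paths. Heavy imports are deferred into the functions so the file can be read and imported without a GPU stack.

```python
BASE_MODEL = "JoaoZaokk/Qwen3-4B-Thinking-2507-MiniMax-M2.1-Distill-heretic"
STAGE1_ADAPTER = "heretic_F_lora_python_5000"
STAGE2_ADAPTER = "heretic_F_lora_python5000_codefeedback5000"

# Values from the LoRA configuration section of this card.
LORA_KWARGS = dict(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)


def load_4bit_base():
    """Load the base model in 4-bit NF4 and prepare it for k-bit training."""
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import prepare_model_for_kbit_training

    bnb = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,  # assumption, not stated in the card
    )
    model = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL, quantization_config=bnb, device_map="auto"
    )
    return prepare_model_for_kbit_training(model)


def stage1_model():
    """Stage 1: attach a fresh LoRA adapter to the quantized base model."""
    from peft import LoraConfig, get_peft_model

    return get_peft_model(load_4bit_base(), LoraConfig(**LORA_KWARGS))


def stage2_model():
    """Stage 2: resume from the stage 1 adapter and keep it trainable."""
    from peft import PeftModel

    return PeftModel.from_pretrained(
        load_4bit_base(), STAGE1_ADAPTER, is_trainable=True
    )


def merge_final(output_dir):
    """Final: merge the last adapter into F16 base weights, save safetensors."""
    import torch
    from transformers import AutoModelForCausalLM
    from peft import PeftModel

    # Older Transformers releases name this argument torch_dtype.
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL, dtype=torch.float16)
    merged = PeftModel.from_pretrained(base, STAGE2_ADAPTER).merge_and_unload()
    merged.save_pretrained(output_dir, safe_serialization=True)
```

After each training stage, calling `save_pretrained(<adapter directory>)` on the PEFT model writes only the adapter weights, which is what the next stage loads. Merging into a freshly loaded F16 base, rather than into the 4-bit model used for training, is a design choice in this sketch that avoids baking quantization error into the merged weights.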

Training environment

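| Component | Version |
| --- | --- |
| Python | 3.11 |
| PyTorch | 2.11.0+cu128 |
| CUDA | 12.8 |
| Transformers | 5.10.2 |
| Datasets | 5.0.0 |
| Accelerate | 1.13.0 |
| PEFT | 0.19.1 |
| bitsandbytes | 0.49.2 |
| sentencepiece | 0.2.1 |
| tiktoken | 0.13.0 |
| protobuf | 7.35.0 |
| pandas | 3.0.3 |
| pyarrow | 24.0.0 |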

Training GPU:

  • NVIDIA GeForce RTX 3080 Ti 12 GB
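When trying to reproduce the environment, it can help to compare installed package versions against the table. The sketch below uses only the standard library. The helper name `find_mismatches` is hypothetical, and the distribution names (for example `torch` for PyTorch) are the usual PyPI names rather than anything stated in this card.

```python
from importlib import metadata

# Versions copied from the training environment table; keys are PyPI names.
EXPECTED = {
    "torch": "2.11.0+cu128",
    "transformers": "5.10.2",
    "datasets": "5.0.0",
    "accelerate": "1.13.0",
    "peft": "0.19.1",
    "bitsandbytes": "0.49.2",
    "sentencepiece": "0.2.1",
    "tiktoken": "0.13.0",
    "protobuf": "7.35.0",
    "pandas": "3.0.3",
    "pyarrow": "24.0.0",
}


def find_mismatches(expected, get_version=metadata.version):
    """Return {package: (wanted, installed)} for every package that differs.

    A package that is not installed is reported with installed=None.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {installed}")
```

An exact match is not necessarily required to load the merged model; the check only flags where a local setup differs from the one used for training.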

Intended use

This model is intended for local experimentation with:

  • Python code generation
  • code explanation
  • simple debugging
  • instruction-following tests
  • downstream conversion to GGUF, AWQ, GPTQ, or OpenVINO formats

Notes

This is an experimental model. It may produce incorrect code, unsafe suggestions, or hallucinated explanations. Outputs should be reviewed before use in production or security-sensitive environments.

Model provider

JoaoZaokk

Modalities

Input: Text
Output: Text
