ertghiu256

Qwen3.5-2b-ReMix

License: apache-2.0

🌟 Model Highlights

🏗️ Base Architecture: Qwen/Qwen3.5-2B (Dense, Hybrid Gated DeltaNet)
💾 Precision format: Native Float16 (F16) merged weights; no adapter required.
🎯 Main Goal: Advanced mathematical reasoning and complex code generation/debugging.
🛡️ Data Origin: 100% open-source distilled reasoning datasets natively hosted on Hugging Face. No proprietary data or closed APIs (OpenAI, Anthropic, Google) were used or involved in the collection or training process.
⚡ Target Environment: Local, high-efficiency edge execution with minimal hardware requirements.

🎛️ Recommended Generation Parameters

Depending on your use case, we recommend switching between "Everyday" and "Deep Reasoning" profiles to get the best performance out of the 2B architecture.

🏠 Everyday Use (Balanced)

Parameter	Value	Note
🌡️ Temperature (`temp`)	`0.4`	Provides a balance of creativity and coherence.
🎯 Top K (`top_k`)	`30`	Restricts sampling to the 30 most probable next tokens.
🔄 Repeat Penalty	`1.1`	Light penalty to ensure conversational flow.

🧠 Deep Reasoning

Parameter	Value	Note
🌡️ Temperature (`temp`)	`0.0 - 0.1`	Forced determinism for strict logical consistency.
🎯 Top K (`top_k`)	`60`	Wider pool for complex technical vocabulary.
🔄 Repeat Penalty	`1.2`	Prevents "reasoning loops" during long chain-of-thought.
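
The two profiles above can be kept as plain data and translated into keyword arguments for Transformers' `model.generate`. This is a minimal sketch, not part of the original card: the names `PROFILES` and `to_generate_kwargs` are hypothetical, and the Deep Reasoning temperature is fixed at `0.1`, the upper end of the recommended range. It relies on two facts about Transformers: the repeat penalty is passed as `repetition_penalty`, and `temperature` and `top_k` only take effect when `do_sample=True`.

```python
# Sampling profiles copied from the two tables above.
# "deep_reasoning" uses 0.1, the upper end of the recommended 0.0 - 0.1 range.
PROFILES = {
    "everyday": {"temperature": 0.4, "top_k": 30, "repeat_penalty": 1.1},
    "deep_reasoning": {"temperature": 0.1, "top_k": 60, "repeat_penalty": 1.2},
}


def to_generate_kwargs(profile_name, max_new_tokens=1024):
    """Translate a profile into keyword arguments for `model.generate`."""
    profile = PROFILES[profile_name]
    kwargs = {
        "max_new_tokens": max_new_tokens,
        # Transformers spells the repeat penalty `repetition_penalty`.
        "repetition_penalty": profile["repeat_penalty"],
    }
    if profile["temperature"] > 0:
        # Temperature and top-k only take effect when sampling is enabled.
        kwargs.update(
            do_sample=True,
            temperature=profile["temperature"],
            top_k=profile["top_k"],
        )
    else:
        # A temperature of 0.0 corresponds to greedy decoding.
        kwargs["do_sample"] = False
    return kwargs


print(to_generate_kwargs("deep_reasoning"))
```

The resulting dictionary can be unpacked into a generation call, for example `model.generate(**model_inputs, **to_generate_kwargs("everyday"))`.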

📊 Training & Merge Details

The model was adapted using Parameter-Efficient Fine-Tuning (PEFT) with LoRA adapters. The adapters were then merged back into the base weights with Unsloth, producing a single set of F16 weights.

🔄 Training Steps: 175
📉 Loss Profile: Training loss reached a minimum of ~0.58 and stabilized around 0.85
📈 Learning Rate: 4e-5
📐 LoRA Rank ($R$) during training: 16
⚖️ LoRA Alpha ($α$) during training: 32
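
For reference, the rank and alpha above determine how strongly the adapter update is scaled when it is merged. The expression below assumes the standard LoRA formulation with scale $α / R$; the card does not state whether a variant such as rank-stabilized scaling was used, so treat this as an illustration rather than a description of the exact merge. Here $W_0$ is a frozen base weight matrix, and $B$ and $A$ are the trained low-rank factors with inner dimension $R$.

```latex
W_{\text{merged}} = W_0 + \frac{\alpha}{R}\, B A
                  = W_0 + \frac{32}{16}\, B A
                  = W_0 + 2\, B A
```

Under that assumption, each adapted layer's update is applied at twice its raw magnitude, after which the adapter matrices are no longer needed at inference time.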

⚠️ Limitations & Risks

While this fine-tune aggressively pushes the boundaries of what a 2B parameter model can achieve locally, users should carefully account for the following behaviors:

🔮 Hallucinations: Like all highly compact models, it can confidently present false calculations or flawed code as absolute facts. Always verify outputs.
🎭 Inconsistent Styles: Due to the "ReMix" nature of the training data, the model may occasionally exhibit shifting output structures or stylistic variations.
🛑 Logic Mismatches: For extremely niche programming or high-level academic proofs, the model may occasionally produce broken syntax or reverse its logical assertions.
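
As one cheap way to catch the broken-syntax case before running generated Python code, the standard library's `ast.parse` can be used as a first filter. This is a minimal sketch, not part of the original card: the helper name `check_python_syntax` is hypothetical, and a passing check says nothing about whether the logic is correct.

```python
import ast


def check_python_syntax(code):
    """Return (True, None) if `code` parses as Python, else (False, message)."""
    try:
        ast.parse(code)
    except SyntaxError as err:
        # Report where parsing failed so the snippet can be regenerated or fixed.
        return False, f"line {err.lineno}: {err.msg}"
    return True, None


ok, error = check_python_syntax("def add(a, b):\n    return a + b\n")
print(ok, error)
```

Code that parses should still be reviewed and tested, since this check cannot detect wrong calculations or reversed logic.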

📦 How to Use Natively

🐍 Using Hugging Face Transformers

python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_path = "YOUR_USERNAME/Qwen3.5-2B-ReMix"

# Load the aligned tokenizer and model weights directly
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, 
    torch_dtype=torch.float16, 
    device_map="auto"
)

messages = [
    {"role": "user", "content": "Explain the logic of a quicksort algorithm and implement it in Python."}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

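# Deep Reasoning profile (near-deterministic sampling)
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1024,
    do_sample=True,          # temperature and top_k are ignored without sampling
    temperature=0.1,
    top_k=60,
    repetition_penalty=1.2,  # Transformers' name for the repeat penalty
)

# Decode only the newly generated tokens, skipping the prompt
new_tokens = generated_ids[0][model_inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))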
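
Decoded output from a reasoning-tuned model often contains the chain-of-thought followed by the final answer. The sketch below separates the two, assuming the reasoning is wrapped in `<think>` and `</think>` tags as in earlier Qwen3 releases. That tag format is an assumption and is not confirmed by this card; the helper name `split_reasoning` is hypothetical.

```python
def split_reasoning(text, open_tag="<think>", close_tag="</think>"):
    """Split generated text into (reasoning, answer).

    Assumes the reasoning, if present, ends with `close_tag`. When the closing
    tag is absent, the whole text is returned as the answer.
    """
    head, sep, tail = text.partition(close_tag)
    if not sep:
        return "", text.strip()
    # Drop the opening tag if the model emitted it, then trim whitespace.
    reasoning = head.replace(open_tag, "", 1).strip()
    return reasoning, tail.strip()


reasoning, answer = split_reasoning("<think>2 + 2 = 4</think>The answer is 4.")
print(answer)
```

If the model uses different delimiters, pass them through `open_tag` and `close_tag`.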

Uploaded finetuned model

Developed by: ertghiu256
License: apache-2.0
Finetuned from model: unsloth/Qwen3.5-2B

This qwen3_5 model was trained 2x faster with Unsloth and Hugging Face's TRL library.

Model Details

Model Provider: ertghiu256
Model Tree: Base: Qwen/Qwen3.5-2B; Fine-tuned: this model
Input Modalities: Text, Image, Video

Output Modalities