S-ABISHEAK/Docmatix-Qwen2-VL-7B-Instruct-finetuned API & Inference Endpoint

Model Details

Developed by: [Your Name/Organization]
Model type: Multimodal Large Language Model (Vision-Language)
Language(s): English
Finetuned from model: unsloth/Qwen2-VL-7B-Instruct
Finetuning approach: LoRA (Low-Rank Adaptation)

Training Details

Training Data

Fine-tuned on the images split of the Docmatix dataset, which focuses on document understanding and visual question answering.

Training Hyperparameters

Method: SFT (Supervised Fine-Tuning) with LoRA
LoRA Rank (r): 8
LoRA Alpha: 16
Optimizer: AdamW (8-bit)
Learning Rate: 1e-4
Batch Size: 1 (with Gradient Accumulation steps: 8)
Max Steps: 200
Precision: fp16/bf16 (depending on hardware compatibility)

How to Get Started with the Model

Loading the LoRA Adapter

python
from unsloth import FastVisionModel
import torch

model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/Qwen2-VL-7B-Instruct",
    load_in_4bit=True,
)

model = FastVisionModel.load_adapter(model, "path_to_your_lora_files")

Inference Example

python
from transformers import TextStreamer

FastVisionModel.for_inference(model)

# Standard Qwen2-VL inference code follows...

Framework versions

PEFT 0.19.1
Unsloth 2026.5.8
Transformers 5.0.0
PyTorch 2.11.0

Model Details

Developed by: [Your Name/Organization]
Model type: Multimodal Large Language Model (Vision-Language)
Language(s): English
Finetuned from model: unsloth/Qwen2-VL-7B-Instruct
Finetuning approach: LoRA (Low-Rank Adaptation)

Training Details

Training Data

Fine-tuned on the images split of the Docmatix dataset, which focuses on document understanding and visual question answering.

Training Hyperparameters

Method: SFT (Supervised Fine-Tuning) with LoRA
LoRA Rank (r): 8
LoRA Alpha: 16
Optimizer: AdamW (8-bit)
Learning Rate: 1e-4
Batch Size: 1 (with Gradient Accumulation steps: 8)
Max Steps: 200
Precision: fp16/bf16 (depending on hardware compatibility)

How to Get Started with the Model

Loading the LoRA Adapter

python
from unsloth import FastVisionModel
import torch

model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/Qwen2-VL-7B-Instruct",
    load_in_4bit=True,
)

model = FastVisionModel.load_adapter(model, "path_to_your_lora_files")

Inference Example

python
from transformers import TextStreamer

FastVisionModel.for_inference(model)

# Standard Qwen2-VL inference code follows...

Framework versions

PEFT 0.19.1
Unsloth 2026.5.8
Transformers 5.0.0
PyTorch 2.11.0

Docmatix-Qwen2-VL-7B-Instruct-finetuned

Get help setting up a custom Dedicated Endpoints.

README

Model Details

Training Details

Training Data

Training Hyperparameters

How to Get Started with the Model

Loading the LoRA Adapter

Inference Example

Framework versions

Explore FriendliAI today

README

Model Details

Training Details

Training Data

Training Hyperparameters

How to Get Started with the Model

Loading the LoRA Adapter

Inference Example

Framework versions