License: apache-2.0
Base model
unsloth/Qwen3.5-4B
Experiment
- Session: unsloth-qwen35-4b-t4
- GPU observed: Tesla T4, 15360 MiB, 14910 MiB
- Training examples: 20
- Max sequence length: 1024
- Max steps: 20
- LoRA rank: 16
- Seed: 3407
- Output folder in Colab Drive: /content/drive/MyDrive/colab-cli-unsloth-qwen35-4b
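
To make the scale of this run easier to reason about, the sketch below collects the Experiment settings above into one record and estimates how many passes over the 20 training examples the 20 steps amount to. The names `ExperimentSettings`, `passes_over_data`, and `effective_batch_size` are illustrative and not part of the original training script, and the card does not report a batch size, so the values tried here are assumptions only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSettings:
    """Values copied from the Experiment list; field names are illustrative."""
    base_model: str = "unsloth/Qwen3.5-4B"
    train_examples: int = 20
    max_seq_length: int = 1024
    max_steps: int = 20
    lora_rank: int = 16
    seed: int = 3407
    output_dir: str = "/content/drive/MyDrive/colab-cli-unsloth-qwen35-4b"


def passes_over_data(settings: ExperimentSettings, effective_batch_size: int) -> float:
    """Approximate number of passes over the training set.

    effective_batch_size is hypothetical: the card does not state it.
    """
    if effective_batch_size < 1:
        raise ValueError("effective_batch_size must be >= 1")
    return settings.max_steps * effective_batch_size / settings.train_examples


settings = ExperimentSettings()
for batch_size in (1, 2, 8):  # assumed values, for illustration only
    print(batch_size, passes_over_data(settings, batch_size))
```

Because the step count equals the example count, the number of passes equals whatever effective batch size was used. In any case this is a very small run, consistent with the "tiny" adapter described under Intended behavior.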
Intended behavior
A tiny adapter for Japanese experiment-report-style responses. It nudges answers toward concise conclusions, observations, and next actions, with attention to Drive-mounted saves, evidence, and reproducibility.
See comparison.md and comparison.json for before/after response checks.
Model provider
MakiAi
Model tree
- Base: unsloth/Qwen3.5-4B
- Adapter: this model
Modalities
- Input: Video, Text, Image
- Output: Text