mlx-community
Huihui-gemma-4-12B-coder-fable5-composer2.5-v1-abliterated-4bit-msq
README
License: apache-2.0
Details
- Type: Vision (VLM)
- Average: 4.45 bits per weight
- Method: MLX Smart Quantize (MSQ)
- AWQ scaling: applied to 96 groups
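The average bit width above gives a rough sense of how much storage the weights need. The sketch below is a back-of-the-envelope estimate only: it assumes roughly 12 billion parameters (inferred from the "12B" in the model name, not stated in the README) and ignores any file-format overhead or tensors stored at a different precision. The function name `estimate_weight_size_gb` is hypothetical and used only for illustration.

```python
def estimate_weight_size_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    total_bits = num_params * bits_per_weight
    return total_bits / 8 / 1e9  # bits -> bytes -> gigabytes


# Assumption: ~12e9 parameters, taken from the "12B" in the model name.
ASSUMED_PARAMS = 12e9
# From the README: average of 4.45 bits per weight.
AVERAGE_BITS = 4.45

size_gb = estimate_weight_size_gb(ASSUMED_PARAMS, AVERAGE_BITS)
print(f"Estimated weight size: {size_gb:.1f} GB")
```

Under the same parameter-count assumption, a 16-bit copy of the weights would take about 24 GB, so the quantized weights are a bit more than a quarter of that size.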
Model provider
mlx-community
Model tree
Base
yuxinlu1/gemma-4-12B-coder-fable5-composer2.5-v1
Quantized
this model
Modalities
Input
Video, Audio, Text, Image
Output
Text