Dedicated Endpoints
Run this model inference on single tenant GPU with unmatched speed and reliability at scale.
Get help setting up a custom Dedicated Endpoints.
Talk with our engineer to get a quote for reserved GPU instances with discounts.
README
Source
- Base model:
unsloth/Qwen3.5-35B-A3B - GGUF runtime package:
teru00801/hawks-qwen3_5-35b-a3b-gguf-0601
MLX conversion
Run this on Apple Silicon:
bash
mlx_lm.convert --model teru00801/hawks-qwen3_5-35b-a3b-merged-0601 -q --upload-repo <your-mlx-repo>
Model provider
teru00801
Model tree
Base
this model
Modalities
Input
Video, Text, Image
Output
Text
Pricing
Dedicated Endpoints
View detailsSupported Functionality
Model APIs
Dedicated Endpoints
Container
More information