nightmedia
Qwen3.6-35B-A3B-Qwable-Holo3-Qwopus
Dedicated Endpoints
Run inference for this model on a single-tenant GPU with unmatched speed and reliability at scale.
Get help setting up a custom Dedicated Endpoint.
Talk with one of our engineers to get a quote for discounted reserved GPU instances.
README
Use with mlx
bash
pip install mlx-lm
python
from mlx_lm import load, generate

model, tokenizer = load("Qwen3.6-35B-A3B-Qwable-Holo3-Qwopus")

prompt = "hello"

if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_dict=False,
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
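The snippet above handles a single user turn. As a minimal sketch, the same chat-template step could be wrapped in a small helper that also accepts earlier turns. The names `build_prompt`, `chat_once`, and the `history` parameter are hypothetical and not part of mlx-lm. The helper makes only the `apply_chat_template` call shown in the README, and it assumes the model's chat template accepts the usual `user` / `assistant` roles.

```python
def build_prompt(tokenizer, user_text, history=None):
    """Build a prompt for generate(), mirroring the README logic.

    `history` is a hypothetical optional list of earlier messages,
    each shaped like {"role": "user" | "assistant", "content": str}.
    """
    # Without a chat template, fall back to the raw text, as the README does.
    if getattr(tokenizer, "chat_template", None) is None:
        return user_text

    messages = list(history or [])
    messages.append({"role": "user", "content": user_text})
    # Same call and arguments as in the README snippet.
    return tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_dict=False,
    )


def chat_once(model_path, user_text, history=None):
    """Load the model and generate one reply (requires mlx-lm to be installed)."""
    # Imported lazily so build_prompt stays usable without mlx-lm.
    from mlx_lm import load, generate

    model, tokenizer = load(model_path)
    prompt = build_prompt(tokenizer, user_text, history)
    return generate(model, tokenizer, prompt=prompt, verbose=True)
```

Calling `chat_once("Qwen3.6-35B-A3B-Qwable-Holo3-Qwopus", "hello")` should then behave like the README snippet. Keeping prompt construction separate from model loading lets the template logic be checked without downloading the weights.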
Model provider
nightmedia
Model tree
Base
lordx64/Qwable-v1
Fine-tuned
this model
Modalities
Input
Video, Text, Image
Output
Text
Pricing
Dedicated Endpoints
View details
Supported Functionality
Model APIs
Dedicated Endpoints
Container
More information