teolm30

fox-1.6

Deploy Dedicated

Dedicated Endpoints

Run this model inference on single tenant GPU with unmatched speed and reliability at scale.

Learn more

Container

Run this model inference with full control and performance in your environment.

Learn more

Get help setting up a custom Dedicated Endpoints.

Talk with our engineer to get a quote for reserved GPU instances with discounts.

What this repo is

A lightweight Hugging Face model repo with real weights and tokenizer files
Intended to be easier to deploy on consumer hardware than much larger LLMs
Branded and published under the Fox 1.6 name

What this repo is not

It is not trained from scratch here
It is not a claim that the model beats every large frontier model
It is a fork of the upstream Qwen 3 1.7B checkpoint

Intended use

On-device or low-footprint inference
Assistant-style chat and completion
Rapid experimentation with a compact base model

Notes

If you want Fox 1.6 to become a genuinely new model, the next step is to fine-tune this fork on a curated instruction dataset and evaluate it against your target benchmarks.

🤖 Run with Ollama

bash
ollama run hf.co/teolm30/fox-1.6

Model provider

teolm30

Model tree

Base

Qwen/Qwen3-1.7B

Quantized

this model

Modalities

Input

Text

Output

Text

Pricing

Dedicated Endpoints

View details

Supported Functionality

Model APIs

Dedicated Endpoints

Container

More information

Model card

Explore FriendliAI today

Get started Talk to an engineer

teolm30

fox-1.6

Deploy Dedicated