meta-llama/Llama-4-Scout-17B-16E-Instruct
Serverless Endpoints
Run inference on this model with a simple API call.
Dedicated Endpoints
Run inference on this model on a single-tenant GPU, with unmatched speed and reliability at scale.
Container
Run inference on this model with full control and performance in your own environment.
API Example
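No concrete example appears under this heading, so here is a minimal sketch of what a serverless call could look like, assuming an OpenAI-compatible chat-completions endpoint. The base URL (`https://api.example.com/v1`) and the `API_KEY` environment variable are placeholder assumptions, not values taken from this page; only the model identifier comes from the listing.

```python
# Hypothetical sketch: building a chat-completions request for
# meta-llama/Llama-4-Scout-17B-16E-Instruct on a serverless endpoint.
# BASE_URL and the API_KEY environment variable are assumed, not
# documented on this page.
import json
import os
import urllib.request

MODEL = "meta-llama/Llama-4-Scout-17B-16E-Instruct"
BASE_URL = "https://api.example.com/v1"  # placeholder endpoint


def build_request(prompt: str) -> urllib.request.Request:
    """Assemble a POST request for an OpenAI-compatible endpoint."""
    payload = {
        "model": MODEL,
        # Input modalities are Text and Image; a plain-text prompt
        # is shown here for brevity.
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ.get('API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request("Summarize this model card in one sentence.")
print(req.full_url)  # where the request would be sent
# To actually send it: urllib.request.urlopen(req)
```

The request is assembled separately from sending it, so you can inspect the payload before spending any serverless-endpoint time (billed per second, per the Pricing section below).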
Model provider: meta-llama
Model tree
Base: meta-llama/Llama-4-Scout-17B-16E
Fine-tuned: this model
Modalities
Input: Text, Image
Output: Text
Pricing
Serverless Endpoints: $0.002 / second
Dedicated Endpoints: View details
Supported Functionality
Serverless Endpoints
Dedicated Endpoints
Container