Deploy generative AI models on PeriFlow Cloud, which runs PeriFlow, our flagship LLM serving engine.
Follow the guide below to deploy any generative AI model on PeriFlow Cloud with ease.
How it works
Create a deployment
From the PeriFlow web interface, you can create new deployments. Each deployment handles the inference of one AI model.
Select your model
You can either upload your own checkpoint or choose one of the models provided by PeriFlow.
Configure cloud resources
PeriFlow Cloud provides multiple virtual machine types across multiple regions. Select a virtual machine type to continue.
Interact with your AI model
Go to the interactive playground to test your AI model live.
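Beyond the playground, a deployment can also be queried programmatically. The sketch below is purely illustrative: the endpoint URL, field names, and authorization scheme are assumptions, and the actual values come from your deployment's detail page.

```python
import json

# Hypothetical deployment endpoint -- replace with the URL shown
# on your deployment's detail page.
ENDPOINT = "https://serving.periflow.example/v1/completions"

def build_request(prompt: str, max_tokens: int = 64) -> dict:
    """Assemble a text-completion request body (assumed schema)."""
    return {"prompt": prompt, "max_tokens": max_tokens}

payload = json.dumps(build_request("Hello, PeriFlow!")).encode()

# To send it against a live deployment (requires an API token):
#   import urllib.request
#   req = urllib.request.Request(
#       ENDPOINT, data=payload,
#       headers={"Content-Type": "application/json",
#                "Authorization": "Bearer <YOUR_TOKEN>"})
#   print(urllib.request.urlopen(req).read().decode())
```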
Monitor your deployments
PeriFlow Cloud monitors your deployments automatically. See how your AI model is performing with our supercharged engine.
Pay only for the compute resources your gen AI model (LLM) uses
You can create multiple deployments, but only one can be running at a time
$0.00035 per sec ($1.26 per hour)
$0.0014 per sec ($5.04 per hour)
$0.00175 per sec ($6.30 per hour)
* Current pricing is subject to change
Custom pricing based on contracts
Cheaper deployment alternative for development purposes
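The hourly figures in the pricing list are simply the per-second rates multiplied by 3,600 seconds. A quick sanity check of that conversion, using the rates quoted above (which are subject to change):

```python
# Per-second VM rates from the pricing list above (subject to change).
rates_per_sec = [0.00035, 0.0014, 0.00175]

# Hourly cost = per-second rate x 3600 seconds in an hour.
hourly = [round(rate * 3600, 2) for rate in rates_per_sec]
print(hourly)  # [1.26, 5.04, 6.3]
```

Because billing is per second, a deployment that runs for only part of an hour costs proportionally less than the hourly figure.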