Friendli Serverless Endpoints: Unleashing Generative AI for Everyone


FriendliAI, the world’s leading generative AI engine company, has launched Friendli Serverless Endpoints, unlocking a new era of accessibility for generative AI inference. Users can access open-source generative AI models through simple API calls at the lowest cost on the market. This innovative service brings the power of Friendli Engine, our GPU-optimized inference engine, to anyone, regardless of their technical expertise.

Say goodbye to deployment headaches: Gone are the days of wrestling with infrastructure and optimizing models on complex GPU machines. Friendli Serverless Endpoints takes care of everything, allowing you to harness the transformative potential of generative AI models right within your applications.

Who is it for? Whether you're:

  • A curious developer eager to experiment with cutting-edge LLMs like Llama-2 and image creation models like Stable Diffusion,
  • A product manager seeking to integrate text generation or image creation into your product, or
  • A researcher exploring LLM capabilities before committing to full fine-tuning,

Friendli Serverless Endpoints provides the perfect platform to unlock the magic of generative AI.

No more barriers: Friendli Serverless Endpoints removes the technical hurdles that often block the adoption of generative AI. You no longer need to worry about setting up infrastructure, optimizing model serving, or even choosing the right GPU. Simply connect your application to Friendli's secure endpoints and start weaving generative AI magic into your workflow.
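To give a feel for what such an integration might look like, here is a minimal sketch of a text-completion call using only the Python standard library. The endpoint URL, model identifier, and request schema below are illustrative assumptions, not the official API; consult the Friendli documentation for the actual interface.

```python
import json
import os
import urllib.request

# Hypothetical endpoint and model name -- check the official Friendli
# docs for the real values before using this in an application.
API_URL = "https://api.friendli.ai/v1/completions"
MODEL = "llama-2-13b-chat"

def build_request(prompt: str, max_tokens: int = 128) -> urllib.request.Request:
    """Assemble an authenticated HTTP request for a completion call."""
    payload = {"model": MODEL, "prompt": prompt, "max_tokens": max_tokens}
    headers = {
        "Content-Type": "application/json",
        # Read the token from the environment; never hard-code credentials.
        "Authorization": f"Bearer {os.environ.get('FRIENDLI_TOKEN', '')}",
    }
    return urllib.request.Request(
        API_URL, data=json.dumps(payload).encode("utf-8"), headers=headers
    )

req = build_request("Write a haiku about GPUs.")
print(req.full_url)
print(json.loads(req.data)["model"])
```

Sending the request is then a single `urllib.request.urlopen(req)` call; the point is that the integration surface is an ordinary HTTPS request, with no GPU or serving stack on your side.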

Power under the hood: While Friendli Serverless Endpoints simplifies your experience, Friendli Engine, the beating heart of the service, delivers unparalleled inference serving performance and cost-efficiency.

  • Reduced costs: $0.2/M tokens for Llama-2 13B and $0.8/M tokens for Llama-2 70B, thanks to Friendli Engine.
  • Low latency: 2-4x faster than leading vLLM-based solutions, ensuring a smooth and responsive generative AI experience.
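With per-token pricing, estimating a bill is simple arithmetic on the rates quoted above. A quick sketch (the token volume is a made-up example):

```python
# Per-million-token rates quoted above, in USD.
PRICE_PER_M_TOKENS = {
    "llama-2-13b": 0.2,
    "llama-2-70b": 0.8,
}

def estimate_cost(model: str, tokens: int) -> float:
    """Return the USD cost of processing `tokens` tokens with `model`."""
    return PRICE_PER_M_TOKENS[model] * tokens / 1_000_000

# Example: 50M tokens on Llama-2 70B.
print(f"${estimate_cost('llama-2-70b', 50_000_000):.2f}")  # -> $40.00
```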

Open doors to diverse models: Get started with a curated selection of popular open-source models including:

  • Large language models: Llama-2 and Llama-2-Chat (13B and 70B), Mistral-7B and Mistral-7B-Instruct
  • Visual models: Stable Diffusion v1.5
  • And more models soon to come!


More choices, more power: While Friendli Serverless Endpoints democratizes generative AI, FriendliAI also offers Friendli Dedicated Endpoints for advanced users. This premium service provides dedicated GPU instances, allowing you to serve your customized models reliably with high performance and low costs.

Start your generative journey today: Friendli Serverless Endpoints is the key to unlocking the incredible potential of generative AI. Sign up today and start building applications that leverage the power of language, image, and code without getting bogged down in technical complexities.

The future is generative. With FriendliAI, it's more accessible than ever.
