- June 25, 2024
Level Up Your Client-Side Interactions with Friendli's gRPC Support

Alongside HTTP, Friendli Inference, our LLM serving engine, also lets you interact with its completion services through gRPC! This blog post covers what gRPC is and how you can use it with the Friendli client for more efficient communication. You can also find the relevant details in our documentation.
What is gRPC?
gRPC is a modern, open-source, high-performance Remote Procedure Call (RPC) framework. It lets applications communicate efficiently by calling a remote service as if it were a local function. Compared with a typical REST API, it adds features such as streaming calls and compact binary messages (Protocol Buffers over HTTP/2 by default), which makes communication more streamlined.
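To make the "remote call that looks like a local call" idea concrete, here is a minimal toy sketch. It is not gRPC itself: it uses JSON and an in-process function in place of Protocol Buffers and a network connection, and the names `PROCEDURES`, `serve`, and `Stub` are hypothetical, chosen only for illustration.

```python
import json
from typing import Any, Callable, Dict

# "Server" side: a registry of procedures that can be called by name.
PROCEDURES: Dict[str, Callable[..., Any]] = {
    "add": lambda a, b: a + b,
}


def serve(request_bytes: bytes) -> bytes:
    """Decode a request, run the named procedure, and encode the result."""
    request = json.loads(request_bytes)
    result = PROCEDURES[request["method"]](*request["args"])
    return json.dumps({"result": result}).encode()


class Stub:
    """Client-side stub: a method call is turned into a request message."""

    def __init__(self, transport: Callable[[bytes], bytes]) -> None:
        self._transport = transport

    def __getattr__(self, method: str) -> Callable[..., Any]:
        # Only reached for names that are not regular attributes,
        # so every unknown method name becomes a remote procedure call.
        def call(*args: Any) -> Any:
            payload = json.dumps({"method": method, "args": args}).encode()
            return json.loads(self._transport(payload))["result"]

        return call


# The "network" here is just a direct function call (an assumption for the demo).
remote = Stub(transport=serve)
total = remote.add(2, 3)  # reads like a local call, runs on the "server"
```

In a real gRPC setup, the stub and the message types are generated from a service definition rather than written by hand.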
Friendli Containers Support HTTP and gRPC
Friendli Containers offer their functionality through both conventional HTTP and gRPC. The gRPC interface for the completion services supports response streaming, so you receive results in chunks as they become available, which is ideal when responses are lengthy.
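The practical effect of response streaming can be sketched without a server. In the illustration below, a plain Python generator stands in for the streamed response (`fake_token_stream` and `consume` are hypothetical names, not part of any SDK), and the event log shows that the client handles each chunk before the next one has been produced.

```python
from typing import Iterator, List


def fake_token_stream(tokens: List[str], log: List[str]) -> Iterator[str]:
    """Hypothetical stand-in for a server-streaming RPC: yields one chunk at a time."""
    for token in tokens:
        log.append(f"server produced {token!r}")
        yield token


def consume(stream: Iterator[str], log: List[str]) -> str:
    """Handle each chunk as soon as it arrives instead of waiting for the full text."""
    parts = []
    for chunk in stream:
        log.append(f"client received {chunk!r}")
        parts.append(chunk)
    return "".join(parts)


events: List[str] = []
text = consume(fake_token_stream(["gRPC ", "streams ", "chunks."], events), events)

# The log alternates between "produced" and "received" entries,
# which is what lets a UI display text while generation is still running.
for line in events:
    print(line)
```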
Using gRPC with the friendli-client SDK
To use gRPC with Friendli Containers, you'll need the friendli-client SDK (version 1.4.1 or later). Here's a breakdown of how to integrate it into your code:
1. Enable gRPC when Launching Friendli Container:
Start the Friendli Container with the --grpc true flag to activate the gRPC server for completions.
2. Choose Your Flavor: Sync or Async
The examples cover both synchronous and asynchronous programming styles and assume that the Friendli Container gRPC server is running on 0.0.0.0:8000:
- Synchronous Approach:
- Asynchronous Approach:
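As a rough sketch of the two styles, the helpers below collect a streamed completion from a synchronous and an asynchronous client. They assume the client exposes `completions.create(prompt=..., stream=True)` and that each streamed chunk has a `.text` attribute; this shape is recalled from the friendli-client documentation and should be checked against your installed version. The `Stub*` classes are hypothetical stand-ins that echo the prompt, so the sketch runs without a container.

```python
import asyncio
from types import SimpleNamespace


def stream_sync(client, prompt: str) -> str:
    """Collect a streamed completion from a synchronous client."""
    stream = client.completions.create(prompt=prompt, stream=True)
    return "".join(chunk.text for chunk in stream)


async def stream_async(client, prompt: str) -> str:
    """Collect a streamed completion from an asynchronous client."""
    stream = await client.completions.create(prompt=prompt, stream=True)
    parts = []
    async for chunk in stream:
        parts.append(chunk.text)
    return "".join(parts)


# --- Hypothetical stubs: echo the prompt one character per chunk ---
class _StubCompletions:
    def create(self, prompt: str, stream: bool = True):
        return (SimpleNamespace(text=ch) for ch in prompt)


class StubClient:
    completions = _StubCompletions()


class _StubAsyncCompletions:
    async def create(self, prompt: str, stream: bool = True):
        async def chunks():
            for ch in prompt:
                yield SimpleNamespace(text=ch)

        return chunks()


class StubAsyncClient:
    completions = _StubAsyncCompletions()


sync_text = stream_sync(StubClient(), "Explain what gRPC is.")
async_text = asyncio.run(stream_async(StubAsyncClient(), "Explain what gRPC is."))
```

With the real SDK, the stubs would be replaced by clients pointed at the container, for example something like `Friendli(base_url="0.0.0.0:8000", use_grpc=True)` and its async counterpart `AsyncFriendli(...)`. Treat those constructor names and arguments as assumptions to verify in the SDK reference.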
3. Remember to Close Connections Properly
By default, the library closes the HTTP and gRPC connections when the client object is garbage-collected. For better resource management, however, it's recommended to close connections explicitly, either by calling the .close() method or by using a context manager (a with block). For example, the earlier examples can be rewritten with a context manager so that the connection is always closed:
- Synchronous Approach:
- Asynchronous Approach:
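The closing patterns described above can be sketched as follows. The `StubClient` and `StubAsyncClient` classes are hypothetical stand-ins that only track whether `close()` was called; the sketch assumes the real sync client supports `with` and the real async client supports `async with` and an awaitable `close()`, which should be confirmed in the SDK reference.

```python
import asyncio


class StubClient:
    """Hypothetical stand-in for a client that owns a connection."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True

    def __enter__(self) -> "StubClient":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.close()
        return False  # do not swallow exceptions


class StubAsyncClient:
    """Hypothetical async counterpart with an awaitable close()."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True

    async def __aenter__(self) -> "StubAsyncClient":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> bool:
        await self.close()
        return False


# Option 1: explicit close in a finally block.
client = StubClient()
try:
    pass  # ... send requests here ...
finally:
    client.close()

# Option 2: a with block closes the client even when a request fails.
failing = StubClient()
try:
    with failing:
        raise RuntimeError("request failed")
except RuntimeError:
    pass


# Option 3: the async equivalent with async with.
async def run_async() -> StubAsyncClient:
    async with StubAsyncClient() as aclient:
        pass  # ... await requests here ...
    return aclient


async_client = asyncio.run(run_async())
```

The with-block form is usually the easier one to get right, because the close call cannot be skipped by an early return or an exception.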
Embrace Streamlined Communication with Friendli's gRPC
gRPC offers a powerful alternative for interacting with Friendli's completion services. With its ability to handle streaming responses, gRPC provides an efficient and performant solution for various use cases. So, next time you're building applications that require real-time or chunked data from Friendli, consider leveraging the power of gRPC!
Learn more about FriendliAI at our website, blogs, or by using the Friendli Suite!
Written by FriendliAI Tech & Research