Friendli provides audio and speech features through Friendli Dedicated Endpoints: you can transcribe audio files to text and run chat completions that combine audio and text inputs. This guide explains how to use these features, with examples for both the Playground and API interfaces.

You can find the full list of available models here.

ASR - /v1/audio/transcriptions

Our ASR (Automatic Speech Recognition) service is designed for efficient audio transcription.
By default, audio input is limited to 30 seconds. If you require support for longer audio inputs, please contact us.

API Usage Example

curl -X POST https://api.friendli.ai/dedicated/v1/audio/transcriptions \
  -H "Authorization: Bearer $FRIENDLI_TOKEN" \
  -H 'Content-Type: multipart/form-data' \
  -F file=@/path/to/audio/file.mp3 \
  -F model="your endpoint id"
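
Because audio input is limited to 30 seconds by default, it can help to check a clip's length locally before uploading it. The following is a minimal sketch and not part of the Friendli API: `within_limit` and `audio_duration` are hypothetical helper names, and the duration lookup assumes that `ffprobe` (shipped with FFmpeg) is installed.

```shell
# Hypothetical helper: succeeds when a duration in seconds is within a limit.
# awk does the comparison because POSIX shell arithmetic is integer-only.
# Empty or non-numeric input is rejected instead of being treated as 0.
within_limit() {
  awk -v d="$1" -v max="$2" \
    'BEGIN { exit !(d ~ /^[0-9]+(\.[0-9]+)?$/ && d + 0 <= max + 0) }'
}

# Assumption: ffprobe is available; this prints the clip duration in seconds.
audio_duration() {
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Usage (not run here):
#   within_limit "$(audio_duration /path/to/audio/file.mp3)" 30 ||
#     echo "clip is longer than the 30-second default" >&2
```

The limit is passed as an argument so the same check can be reused if a longer limit has been arranged for your endpoint.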

Supported Models

We support a variety of powerful ASR models, including:


Audio Modality - /v1/chat/completions

The audio modality endpoint allows you to combine audio and text inputs, enabling advanced AI tasks.

This endpoint is ideal for:

  • Complex audio and text analysis
  • Conversational AI
  • Inference tasks such as summarization, sentiment analysis, and question answering

By default, audio input is limited to 10 seconds. If you require support for longer audio inputs, please contact us.

Passing a URL

curl -X POST https://api.friendli.ai/dedicated/v1/chat/completions \
  -H "Authorization: Bearer $FRIENDLI_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your endpoint id",
    "messages": [{
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "What is in this audio?"
            },
            {
                "type": "audio_url",
                "audio_url": {
                    "url": "https://example.com/path/to/audio.mp3"
                }
            }
        ]
    }]
  }'
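
Quoting a JSON body inline on the command line is easy to get wrong, for example when the prompt contains an apostrophe. One way around this, sketched here, is to generate the body in a small helper and pipe it to curl. `build_audio_payload` is a hypothetical helper, not part of any Friendli tooling, and it assumes its arguments contain no double quotes or backslashes because it performs no JSON escaping.

```shell
# Hypothetical helper: prints a request body with one text prompt and one audio URL.
# Arguments: $1 = endpoint ID, $2 = prompt text, $3 = audio URL.
# Assumption: no argument contains a double quote or backslash (no JSON escaping).
build_audio_payload() {
  cat <<EOF
{
  "model": "$1",
  "messages": [{
    "role": "user",
    "content": [
      {"type": "text", "text": "$2"},
      {"type": "audio_url", "audio_url": {"url": "$3"}}
    ]
  }]
}
EOF
}

# Usage (not run here): "-d @-" tells curl to read the body from stdin.
#   build_audio_payload "your endpoint id" "What's in this audio?" \
#     "https://example.com/path/to/audio.mp3" |
#     curl -X POST https://api.friendli.ai/dedicated/v1/chat/completions \
#       -H "Authorization: Bearer $FRIENDLI_TOKEN" \
#       -H "Content-Type: application/json" \
#       -d @-
```

For prompts that may contain arbitrary characters, a JSON-aware tool would be a safer way to build the body than string interpolation.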

Supported Models

We offer a range of multi-modal models, including:


Supported Audio Formats

Our platform accepts the audio formats that the librosa library can load. Supported formats include:

  • MP3 (.mp3)
  • WAV (.wav)
  • FLAC (.flac)
  • OGG (.ogg)
  • And many other standard audio formats
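
A quick extension check can catch obviously unsuitable files before an upload. This sketch is illustrative only: `has_listed_audio_extension` is a hypothetical name, and it covers just the four extensions named in the list, so a file it rejects may still be in a format the platform accepts.

```shell
# Hypothetical pre-flight check based only on the file extension.
# Covers just .mp3, .wav, .flac, and .ogg; other formats may also be accepted.
has_listed_audio_extension() {
  # Lowercase the name so that "CLIP.MP3" matches as well.
  name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$name" in
    *.mp3|*.wav|*.flac|*.ogg) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   has_listed_audio_extension "$file" || echo "unlisted extension: $file" >&2
```

The check looks at the file name only; it does not inspect the file contents.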