curl --request POST \
  --url https://api.friendli.ai/dedicated/v1/audio/transcriptions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form file='@example-file' \
  --form 'model=(endpoint-id)'

Example response:

{
  "text": "Hello, how are you?",
  "usage": {
    "type": "tokens",
    "input_tokens": 20,
    "output_tokens": 10,
    "total_tokens": 30,
    "input_audio_length_ms": 18000,
    "processed_audio_length_ms": 24000,
    "input_token_details": {
      "audio_tokens": 10,
      "text_tokens": 10
    }
  }
}

Transcribe audio files to text using your Friendli Dedicated Endpoint. Upload an audio file and receive a text transcription from the deployed model.
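The same request can be made from Python using only the standard library. This is a minimal sketch mirroring the curl example: the token and endpoint ID are placeholders, and the multipart encoding is built by hand since urllib has no multipart helper.

```python
# Stdlib-only sketch of POSTing an audio file to the transcription
# endpoint. Token and endpoint ID are placeholders you must supply.
import json
import urllib.request
import uuid


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode plain form fields plus one file as multipart/form-data.

    Returns (body_bytes, boundary) for use in the Content-Type header.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; '
            f'name="{name}"\r\n\r\n'.encode() + value + b"\r\n"
        )
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; '
        f'name="{file_field}"; filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode()
        + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), boundary


def transcribe(token, endpoint_id, audio_path):
    """Upload an audio file and return the parsed JSON transcription."""
    with open(audio_path, "rb") as f:
        audio = f.read()
    body, boundary = build_multipart(
        {"model": endpoint_id.encode()}, "file", audio_path, audio
    )
    req = urllib.request.Request(
        "https://api.friendli.ai/dedicated/v1/audio/transcriptions",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The hand-rolled encoder is only there to keep the example dependency-free; in practice an HTTP client with multipart support does the same job.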
Given an audio file, the model transcribes it into text. Requests must include a Personal API Key (e.g. flp_XXX) as the Bearer token. Refer to the authentication section on the introduction page to learn how to acquire and generate your API Key.

Documentation Index
Fetch the complete documentation index at: https://friendli.ai/docs/llms.txt
Use this file to discover all available pages before exploring further.
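The index can be fetched and split into candidate pages with a small stdlib helper. This is a sketch: the exact format of llms.txt is not specified on this page, so the parser simply keeps non-empty lines.

```python
# Sketch: download the documentation index and list its entries.
# The llms.txt format is assumed to be one entry per line.
import urllib.request


def parse_doc_index(text):
    """Keep non-empty lines from an llms.txt index."""
    return [ln.strip() for ln in text.splitlines() if ln.strip()]


def fetch_doc_index(url="https://friendli.ai/docs/llms.txt"):
    """Fetch the index and return its entries as a list of strings."""
    with urllib.request.urlopen(url) as resp:
        return parse_doc_index(resp.read().decode("utf-8"))
```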
ID of team to run requests as (optional parameter).
ID of the target endpoint. To send the request to a specific adapter, use the format "YOUR_ENDPOINT_ID:YOUR_ADAPTER_ROUTE"; otherwise, use "YOUR_ENDPOINT_ID" alone.
"(endpoint-id)"
The audio file object (not the file name) to transcribe. Most standard audio formats are supported, including mp3, wav, flac, and ogg.
Controls how the audio is cut into chunks. When set to "auto", the server first normalizes loudness and then uses voice activity detection (VAD) to choose boundaries. server_vad object can be provided to tweak VAD detection parameters manually. If unset, the audio is transcribed as a single block.
"auto"
Whether to stream the transcription result. When set to true, the transcription result will be streamed as server-sent events once generated.
The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
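The optional parameters above can be collected into extra form fields before the upload. A minimal sketch, with the caveat that the field names "temperature" and "stream" are assumed from the parameter descriptions and are not confirmed by this page:

```python
# Sketch: build optional form fields for the transcription request.
# Field names "temperature" and "stream" are assumptions, not
# confirmed by this reference page.
from typing import Optional


def optional_fields(temperature: Optional[float] = None,
                    stream: Optional[bool] = None):
    fields = {}
    if temperature is not None:
        # The documented valid range is 0 to 1.
        if not 0.0 <= temperature <= 1.0:
            raise ValueError("temperature must be between 0 and 1")
        fields["temperature"] = str(temperature).encode()
    if stream is not None:
        fields["stream"] = b"true" if stream else b"false"
    return fields
```

Lower temperatures (e.g. 0.2) favor deterministic transcriptions; higher ones (e.g. 0.8) increase randomness, as described above.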
Successfully transcribed the audio file.
The transcribed text.
The type of the usage object. Always tokens for this variant.
"tokens"
Number of input tokens billed for this request.
Number of output tokens generated.
Total number of tokens used (input + output).
The length of the input audio in milliseconds.
The length of the processed audio in milliseconds.
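The usage fields above can be read straight out of the response JSON. A small sketch that summarizes the billing-relevant numbers from a response shaped like the example at the top of this page:

```python
# Sketch: summarize token usage from a transcription response.
# Field names match the example response shown earlier on this page.
import json


def summarize_usage(response_json):
    """Return a one-line summary of token usage and audio length."""
    data = json.loads(response_json)
    usage = data["usage"]
    return (
        f'{usage["input_tokens"]} in + {usage["output_tokens"]} out '
        f'= {usage["total_tokens"]} tokens for '
        f'{usage["input_audio_length_ms"] / 1000:.0f}s of audio'
    )
```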