OpenAI: GPT-4o Audio

OpenAIID: openai/gpt-4o-audio-preview

The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement allows the model to detect nuances within audio recordings and add depth to generated user experiences. Audio outputs are currently not supported. Audio tokens are priced at $40 per million input and $80 per million output audio tokens.

Pricing per 1M Tokens

Input (Prompt)	$2.50
Output (Completion)	$10
Cache Read	Free
Cache Write	Free
Image	N/A

Specifications

Context Length	128K
Max Output Tokens	16K
Input Modalities	Audio + Text
Output Modalities	Text + Audio
Tokenizer	GPT
Instruct Type	N/A
Top Provider Context	128K
Top Provider Max Output	16K
Moderated	Yes