OpenAI: GPT-4o Audio

OpenAIID: openai/gpt-4o-audio-preview

The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement allows the model to detect nuances within audio recordings and add depth to generated user experiences. Audio outputs are currently not supported. Audio tokens are priced at $40 per million input and $80 per million output audio tokens.

Pricing per 1M Tokens

Input (Prompt)$2.50
Output (Completion)$10
Cache ReadFree
Cache WriteFree
ImageN/A

Specifications

Context Length128K
Max Output Tokens16K
Input ModalitiesAudio + Text
Output ModalitiesText + Audio
TokenizerGPT
Instruct TypeN/A
Top Provider Context128K
Top Provider Max Output16K
ModeratedYes

Compare this model

See how OpenAI: GPT-4o Audio stacks up against other models.

More from OpenAI

Last updated: March 23, 2026

First tracked: March 23, 2026