OpenAI: GPT Audio

ID: openai/gpt-audio

The gpt-audio model is OpenAI's first generally available audio model. The new snapshot features an upgraded decoder for more natural-sounding voices and maintains better voice consistency. Audio tokens are priced at $32 per million input tokens and $64 per million output tokens.

Pricing per 1M Tokens

Input (Prompt): $2.50
Output (Completion): $10
Cache Read: Free
Cache Write: Free
Image: N/A
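
To make the pricing arithmetic concrete, here is a minimal sketch of a cost estimate built from the rates listed above. It assumes the table rates ($2.50 and $10) apply to text tokens and the $32 and $64 rates from the description apply to audio tokens; the listing does not state this split explicitly. The function name `estimate_cost_usd` and its parameters are hypothetical, not part of any API.

```python
# Rates in USD per 1M tokens, copied from the listing.
# ASSUMPTION: table rates are for text tokens; the $32/$64 rates are for audio tokens.
TEXT_INPUT_PER_M = 2.50
TEXT_OUTPUT_PER_M = 10.00
AUDIO_INPUT_PER_M = 32.00
AUDIO_OUTPUT_PER_M = 64.00


def estimate_cost_usd(text_in=0, text_out=0, audio_in=0, audio_out=0):
    """Estimate request cost in USD from token counts (hypothetical helper)."""
    total_per_million = (
        text_in * TEXT_INPUT_PER_M
        + text_out * TEXT_OUTPUT_PER_M
        + audio_in * AUDIO_INPUT_PER_M
        + audio_out * AUDIO_OUTPUT_PER_M
    )
    # Rates are quoted per 1M tokens, so scale the weighted sum down.
    return total_per_million / 1_000_000
```

Under these assumptions, a request with 2,000 text input tokens and 10,000 audio output tokens comes to (2,000 × 2.50 + 10,000 × 64) / 1,000,000, or about $0.645. Cache reads and writes are listed as free, so they are left out of the estimate.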

Specifications

Context Length: 128K
Max Output Tokens: 16K
Input Modalities: Text + Audio
Output Modalities: Text + Audio
Tokenizer: GPT
Instruct Type: N/A
Top Provider Context: 128K
Top Provider Max Output: 16K
Moderated: Yes
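
The context and output limits above can be checked before sending a request. The following is a rough sketch only: it reads "128K" and "16K" as 128,000 and 16,000 tokens (the exact values may differ, for example 131,072 and 16,384), and it assumes prompt and output tokens share the same context window. The name `fits_limits` is hypothetical.

```python
# ASSUMPTION: "128K" and "16K" are read as round thousands; the exact
# limits are not given in the listing.
CONTEXT_LENGTH = 128_000
MAX_OUTPUT_TOKENS = 16_000


def fits_limits(prompt_tokens, requested_output_tokens):
    """Return True if a request stays within the listed limits (hypothetical helper)."""
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        return False
    # ASSUMPTION: prompt and output tokens count against one shared window.
    return prompt_tokens + requested_output_tokens <= CONTEXT_LENGTH
```

For example, `fits_limits(100_000, 16_000)` passes, while `fits_limits(120_000, 16_000)` fails because the combined total exceeds the assumed window.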

Last updated: March 23, 2026

First tracked: March 23, 2026