OpenAI: GPT-4o Audio
OpenAIID: openai/gpt-4o-audio-preview
The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement allows the model to detect nuances within audio recordings and add depth to generated user experiences. Audio outputs are currently not supported. Audio tokens are priced at $40 per million input and $80 per million output audio tokens.
Pricing per 1M Tokens
| Input (Prompt) | $2.50 |
| Output (Completion) | $10 |
| Cache Read | Free |
| Cache Write | Free |
| Image | N/A |
Specifications
| Context Length | 128K |
| Max Output Tokens | 16K |
| Input Modalities | Audio + Text |
| Output Modalities | Text + Audio |
| Tokenizer | GPT |
| Instruct Type | N/A |
| Top Provider Context | 128K |
| Top Provider Max Output | 16K |
| Moderated | Yes |
Compare this model
See how OpenAI: GPT-4o Audio stacks up against other models.
More from OpenAI
Last updated: March 23, 2026
First tracked: March 23, 2026