Fastest AI Models -- Speed Rankings

For when response time matters: real-time applications, chatbots, and interactive tools need speed above all. Models are ranked by output speed in tokens per second.

| # | Model | Provider | Speed | Latency | Intelligence | Price/1M |
|---|-------|----------|-------|---------|--------------|----------|
| 1 | NVIDIA: Nemotron 3 Super | NVIDIA | 402 tok/s | 0.59s | 36.0 | $0.10 |
| 2 | Google: Gemini 2.5 Flash | Google | 343 tok/s | 0.31s | 19.4 | $0.30 |
| 3 | Amazon: Nova Micro 1.0 | Amazon | 329 tok/s | 0.34s | 10.3 | $0.04 |
| 4 | OpenAI: gpt-oss-20b | OpenAI | 304 tok/s | 0.44s | 24.5 | $0.03 |
| 5 | Mistral: Ministral 3 3B 2512 | Mistral AI | 291 tok/s | 0.25s | 11.2 | $0.10 |
| 6 | OpenAI: gpt-oss-120b | OpenAI | 289 tok/s | 0.49s | 33.3 | $0.04 |
| 7 | OpenAI: GPT-5 | OpenAI | 221 tok/s | 2.15s | 44.4 | $1.25 |
| 8 | Amazon: Nova Lite 1.0 | Amazon | 219 tok/s | 0.38s | 12.7 | $0.06 |
| 9 | Mistral: Mistral Small 3 | Mistral AI | 216 tok/s | 0.29s | 15.1 | $0.05 |
| 10 | Google: Gemini 2.5 Flash Lite Preview 09-2025 | Google | 214 tok/s | 5.29s | 33.5 | $0.10 |
| 11 | Google: Gemini 3.1 Flash Lite Preview | Google | 214 tok/s | 5.29s | 33.5 | $0.25 |
| 12 | Mistral: Devstral Small 1.1 | Mistral AI | 204 tok/s | 0.34s | 19.5 | $0.10 |
| 13 | xAI: Grok 3 Mini | xAI | 199 tok/s | 0.37s | 32.1 | $0.30 |
| 14 | Meta: Llama 3.1 8B Instruct | Meta | 197 tok/s | 0.44s | 11.8 | $0.02 |
| 15 | Meta: Llama 3 8B Instruct | Meta | 197 tok/s | 0.44s | 11.8 | $0.03 |
| 16 | Google: Gemini 3 Flash Preview | Google | 186 tok/s | 0.70s | 35.0 | $0.50 |
| 17 | Amazon: Nova 2 Lite | Amazon | 176 tok/s | 0.53s | 18.0 | $0.30 |
| 18 | xAI: Grok Code Fast 1 | xAI | 172 tok/s | 2.93s | 28.7 | $0.20 |
| 19 | OpenAI: GPT-5 Codex | OpenAI | 170 tok/s | 4.79s | 44.6 | $1.25 |
| 20 | Mistral: Mistral Small 3.1 24B | Mistral AI | 159 tok/s | 0.40s | 12.7 | $0.03 |
| 21 | Mistral: Mistral Small 3.2 24B | Mistral AI | 159 tok/s | 0.40s | 12.7 | $0.07 |
| 22 | OpenAI: GPT-4o Audio | OpenAI | 157 tok/s | 0.48s | 17.3 | $2.50 |
| 23 | Mistral: Mistral 7B Instruct v0.1 | Mistral AI | 157 tok/s | 0.29s | 7.4 | $0.11 |
| 24 | Mistral: Mistral Small Creative | Mistral AI | 157 tok/s | 0.40s | 10.2 | $0.10 |
| 25 | OpenAI: GPT-5 Nano | OpenAI | 154 tok/s | 0.68s | 13.8 | $0.05 |
| 26 | Amazon: Nova Pro 1.0 | Amazon | 151 tok/s | 8.94s | 35.7 | $0.80 |
| 27 | OpenAI: o3 | OpenAI | 143 tok/s | 6.39s | 25.9 | $2.00 |
| 28 | Mistral: Devstral Medium | Mistral AI | 143 tok/s | 0.40s | 18.7 | $0.40 |
| 29 | Mistral: Ministral 3 8B 2512 | Mistral AI | 141 tok/s | 0.27s | 14.8 | $0.15 |
| 30 | OpenAI: o4 Mini Deep Research | OpenAI | 140 tok/s | 17.07s | 33.1 | $2.00 |
| 31 | NVIDIA: Nemotron Nano 12B 2 VL | NVIDIA | 138 tok/s | 0.55s | 10.1 | $0.20 |
| 32 | xAI: Grok 4 Fast | xAI | 137 tok/s | 3.83s | 35.1 | $0.20 |
| 33 | Meta: Llama 4 Maverick | Meta | 129 tok/s | 0.47s | 18.4 | $0.15 |
| 34 | Meta: Llama 4 Scout | Meta | 129 tok/s | 0.45s | 13.5 | $0.08 |
| 35 | Mistral: Ministral 3 14B 2512 | Mistral AI | 124 tok/s | 0.29s | 16.0 | $0.20 |
| 36 | NVIDIA: Nemotron Nano 9B V2 | NVIDIA | 123 tok/s | 0.22s | 14.8 | $0.04 |
| 37 | Google: Gemini 2.5 Pro Preview 06-05 | Google | 120 tok/s | 21.69s | 34.6 | $1.25 |
| 38 | Google: Gemini 2.5 Pro | Google | 120 tok/s | 21.69s | 34.6 | $1.25 |
| 39 | Google: Nano Banana Pro (Gemini 3 Pro Image Preview) | Google | 119 tok/s | 21.66s | 48.4 | $2.00 |
| 40 | Anthropic: Claude 3 Haiku | Anthropic | 118 tok/s | 9.88s | 37.1 | $0.25 |
| 41 | Anthropic: Claude Haiku 4.5 | Anthropic | 118 tok/s | 9.88s | 37.1 | $1.00 |
| 42 | Google: Gemini 3.1 Pro Preview | Google | 117 tok/s | 21.91s | 57.2 | $2.00 |
| 43 | Google: Gemini 3.1 Pro Preview Custom Tools | Google | 117 tok/s | 21.91s | 57.2 | $2.00 |
| 44 | Google: Gemini 3 Pro Preview | Google | 116 tok/s | 3.88s | 41.3 | $2.00 |
| 45 | OpenAI: GPT-5 Chat | OpenAI | 116 tok/s | 0.63s | 21.8 | $1.25 |
| 46 | NVIDIA: Nemotron 3 Nano 30B A3B | NVIDIA | 116 tok/s | 1.38s | 24.3 | $0.05 |
| 47 | OpenAI: o1-pro | OpenAI | 115 tok/s | 17.50s | 30.8 | $150 |
| 48 | Perplexity: Sonar Pro Search | Perplexity | 113 tok/s | 1.01s | 15.5 | $3.00 |
| 49 | OpenAI: GPT-3.5 Turbo (older v0613) | OpenAI | 105 tok/s | 0.44s | 9.0 | $1.00 |
| 50 | OpenAI: GPT-3.5 Turbo 16k | OpenAI | 105 tok/s | 0.44s | 9.0 | $3.00 |

Benchmark data by Artificial Analysis
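
The ranking sorts by output speed alone, but for short replies the latency figure can matter more than tokens per second. The sketch below is a rough illustration, not the method behind the rankings. It assumes that the Latency column is the wait before the first output token and that total response time is roughly latency plus output tokens divided by speed; the page does not state either. The function names and the three-model sample are hypothetical, with the numbers copied from the table.

```python
def estimated_response_seconds(latency_s: float, speed_tok_s: float, output_tokens: int) -> float:
    """Rough end-to-end time, ASSUMING latency is time to first token
    and the remaining tokens stream at a constant speed."""
    if speed_tok_s <= 0:
        raise ValueError("speed_tok_s must be positive")
    return latency_s + output_tokens / speed_tok_s


# (latency in seconds, speed in tok/s), copied from the table for illustration
SAMPLE_MODELS = {
    "NVIDIA: Nemotron 3 Super": (0.59, 402),
    "Mistral: Ministral 3 3B 2512": (0.25, 291),
    "OpenAI: GPT-5": (2.15, 221),
}


def rank_by_estimated_time(models: dict, output_tokens: int) -> list:
    """Return model names ordered from fastest to slowest estimated response."""
    return sorted(
        models,
        key=lambda name: estimated_response_seconds(*models[name], output_tokens),
    )


if __name__ == "__main__":
    for n in (50, 1000):
        print(n, "tokens:", rank_by_estimated_time(SAMPLE_MODELS, n))
```

Under these assumptions, the lower-latency Ministral 3 3B comes out ahead of Nemotron 3 Super for a 50-token reply, while Nemotron 3 Super leads for a 1000-token reply, so the expected output length is worth considering alongside the speed ranking.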
