Multimodal AI Models
AI models that accept images, audio, video, or other non-text inputs.
100 models
| # | Model | Provider | Context | Max Output | Input $/1M | Output $/1M | Modalities |
|---|---|---|---|---|---|---|---|
| 1 | xAI: Grok 4.20 Beta | xAI | 2M | N/A | $2.00 | $6.00 | Text + Image → Text |
| 2 | xAI: Grok 4.1 Fast | xAI | 2M | 30K | $0.20 | $0.50 | Text + Image → Text |
| 3 | Auto Router | OpenRouter | 2M | N/A | <$0.01 | <$0.01 | Text + Image + Audio + File + Video → Text + Image |
| 4 | xAI: Grok 4.20 Multi-Agent Beta | xAI | 2M | N/A | $2.00 | $6.00 | Text + Image → Text |
| 5 | xAI: Grok 4 Fast | xAI | 2M | 30K | $0.20 | $0.50 | Text + Image → Text |
| 6 | OpenAI: GPT-5.4 | OpenAI | 1.1M | 128K | $2.50 | $15 | Text + Image + File → Text |
| 7 | OpenAI: GPT-5.4 Pro | OpenAI | 1.1M | 128K | $30 | $180 | Text + Image + File → Text |
| 8 | Google: Gemini 2.5 Flash Lite | Google | 1.0M | 66K | $0.10 | $0.40 | Text + Image + File + Audio + Video → Text |
| 9 | Google: Gemini 2.5 Flash | Google | 1.0M | 66K | $0.30 | $2.50 | File + Image + Text + Audio + Video → Text |
| 10 | Google: Gemini 3.1 Pro Preview Custom Tools | Google | 1.0M | 66K | $2.00 | $12 | Text + Audio + Image + Video + File → Text |
| 11 | Google: Gemini 3.1 Flash Lite Preview | Google | 1.0M | 66K | $0.25 | $1.50 | Text + Image + Video + File + Audio → Text |
| 12 | Google: Gemini 2.0 Flash | Google | 1.0M | 8K | $0.10 | $0.40 | Text + Image + File + Audio + Video → Text |
| 13 | Google: Gemini 3 Pro Preview | Google | 1.0M | 66K | $2.00 | $12 | Text + Image + File + Audio + Video → Text |
| 14 | Google: Gemini 3 Flash Preview | Google | 1.0M | 66K | $0.50 | $3.00 | Text + Image + File + Audio + Video → Text |
| 15 | Google: Gemini 3.1 Pro Preview | Google | 1.0M | 66K | $2.00 | $12 | Audio + File + Image + Text + Video → Text |
| 16 | Google: Gemini 2.5 Pro Preview 05-06 | Google | 1.0M | 66K | $1.25 | $10 | Text + Image + File + Audio + Video → Text |
| 17 | Google: Gemini 2.0 Flash Lite | Google | 1.0M | 8K | $0.07 | $0.30 | Text + Image + File + Audio + Video → Text |
| 18 | Google: Gemini 2.5 Flash Lite Preview 09-2025 | Google | 1.0M | 66K | $0.10 | $0.40 | Text + Image + File + Audio + Video → Text |
| 19 | Google: Gemini 2.5 Pro Preview 06-05 | Google | 1.0M | 66K | $1.25 | $10 | File + Image + Text + Audio → Text |
| 20 | Meta: Llama 4 Maverick | Meta | 1.0M | 16K | $0.15 | $0.60 | Text + Image → Text |
| 21 | Google: Gemini 2.5 Pro | Google | 1.0M | 66K | $1.25 | $10 | Text + Image + File + Audio + Video → Text |
| 22 | OpenAI: GPT-4.1 Mini | OpenAI | 1.0M | 33K | $0.40 | $1.60 | Image + Text + File → Text |
| 23 | OpenAI: GPT-4.1 | OpenAI | 1.0M | 33K | $2.00 | $8.00 | Image + Text + File → Text |
| 24 | OpenAI: GPT-4.1 Nano | OpenAI | 1.0M | 33K | $0.10 | $0.40 | Image + Text + File → Text |
| 25 | MiniMax: MiniMax-01 | MiniMax | 1.0M | 1.0M | $0.20 | $1.10 | Text + Image → Text |
| 26 | Qwen: Qwen3.5 Plus 2026-02-15 | Qwen | 1M | 66K | $0.26 | $1.56 | Text + Image + Video → Text |
| 27 | Amazon: Nova Premier 1.0 | Amazon | 1M | 32K | $2.50 | $13 | Text + Image → Text |
| 28 | Qwen: Qwen3.5-Flash | Qwen | 1M | 66K | $0.07 | $0.26 | Text + Image + Video → Text |
| 29 | Anthropic: Claude Sonnet 4.6 | Anthropic | 1M | 128K | $3.00 | $15 | Text + Image → Text |
| 30 | Amazon: Nova 2 Lite | Amazon | 1M | 66K | $0.30 | $2.50 | Text + Image + Video + File → Text |
| 31 | Anthropic: Claude Sonnet 4.5 | Anthropic | 1M | 64K | $3.00 | $15 | Text + Image + File → Text |
| 32 | Anthropic: Claude Opus 4.6 | Anthropic | 1M | 128K | $5.00 | $25 | Text + Image → Text |
| 33 | OpenAI: GPT-5 Nano | OpenAI | 400K | 128K | $0.05 | $0.40 | Text + Image + File → Text |
| 34 | OpenAI: GPT-5.1 | OpenAI | 400K | 128K | $1.25 | $10 | Image + Text + File → Text |
| 35 | OpenAI: GPT-5 | OpenAI | 400K | 128K | $1.25 | $10 | Text + Image + File → Text |
| 36 | OpenAI: GPT-5.1-Codex | OpenAI | 400K | 128K | $1.25 | $10 | Text + Image → Text |
| 37 | OpenAI: GPT-5.1-Codex-Mini | OpenAI | 400K | 100K | $0.25 | $2.00 | Image + Text → Text |
| 38 | OpenAI: GPT-5.4 Mini | OpenAI | 400K | 128K | $0.75 | $4.50 | File + Image + Text → Text |
| 39 | OpenAI: GPT-5.4 Nano | OpenAI | 400K | 128K | $0.20 | $1.25 | File + Image + Text → Text |
| 40 | OpenAI: GPT-5.2-Codex | OpenAI | 400K | 128K | $1.75 | $14 | Text + Image → Text |
| 41 | OpenAI: GPT-5 Codex | OpenAI | 400K | 128K | $1.25 | $10 | Text + Image → Text |
| 42 | OpenAI: GPT-5.2 Pro | OpenAI | 400K | 128K | $21 | $168 | Image + Text + File → Text |
| 43 | OpenAI: GPT-5.2 | OpenAI | 400K | 128K | $1.75 | $14 | File + Image + Text → Text |
| 44 | OpenAI: GPT-5.3-Codex | OpenAI | 400K | 128K | $1.75 | $14 | Text + Image + File → Text |
| 45 | OpenAI: GPT-5.1-Codex-Max | OpenAI | 400K | 128K | $1.25 | $10 | Text + Image → Text |
| 46 | OpenAI: GPT-5 Mini | OpenAI | 400K | 128K | $0.25 | $2.00 | Text + Image + File → Text |
| 47 | OpenAI: GPT-5 Image Mini | OpenAI | 400K | 128K | $2.50 | $2.00 | File + Image + Text → Image + Text |
| 48 | OpenAI: GPT-5 Pro | OpenAI | 400K | 128K | $15 | $120 | Image + Text + File → Text |
| 49 | OpenAI: GPT-5 Image | OpenAI | 400K | 128K | $10 | $10 | Image + Text + File → Image + Text |
| 50 | Meta: Llama 4 Scout | Meta | 328K | 16K | $0.08 | $0.30 | Text + Image → Text |
| 51 | Amazon: Nova Pro 1.0 | Amazon | 300K | 5K | $0.80 | $3.20 | Text + Image → Text |
| 52 | Amazon: Nova Lite 1.0 | Amazon | 300K | 5K | $0.06 | $0.24 | Text + Image → Text |
| 53 | Xiaomi: MiMo-V2-Omni | Xiaomi | 262K | 66K | $0.40 | $2.00 | Text + Audio + Image + Video → Text |
| 54 | Qwen: Qwen3.5-27B | Qwen | 262K | 66K | $0.20 | $1.56 | Text + Image + Video → Text |
| 55 | ByteDance Seed: Seed-2.0-Lite | ByteDance Seed | 262K | 131K | $0.25 | $2.00 | Text + Image + Video → Text |
| 56 | ByteDance Seed: Seed-2.0-Mini | ByteDance Seed | 262K | 131K | $0.10 | $0.40 | Text + Image + Video → Text |
| 57 | Qwen: Qwen3.5-35B-A3B | Qwen | 262K | 66K | $0.16 | $1.30 | Text + Image + Video → Text |
| 58 | Qwen: Qwen3.5-122B-A10B | Qwen | 262K | 66K | $0.26 | $2.08 | Text + Image + Video → Text |
| 59 | Mistral: Ministral 3 14B 2512 | Mistral AI | 262K | N/A | $0.20 | $0.20 | Text + Image → Text |
| 60 | Qwen: Qwen3.5 397B A17B | Qwen | 262K | 66K | $0.39 | $2.34 | Text + Image + Video → Text |
| 61 | MoonshotAI: Kimi K2.5 | MoonshotAI | 262K | 66K | $0.45 | $2.20 | Text + Image → Text |
| 62 | ByteDance Seed: Seed 1.6 Flash | ByteDance Seed | 262K | 33K | $0.07 | $0.30 | Image + Text + Video → Text |
| 63 | ByteDance Seed: Seed 1.6 | ByteDance Seed | 262K | 33K | $0.25 | $2.00 | Image + Text + Video → Text |
| 64 | Mistral: Mistral Large 3 2512 | Mistral AI | 262K | N/A | $0.50 | $1.50 | Text + Image → Text |
| 65 | Qwen: Qwen3 VL 235B A22B Instruct | Qwen | 262K | N/A | $0.20 | $0.88 | Text + Image → Text |
| 66 | Mistral: Ministral 3 8B 2512 | Mistral AI | 262K | N/A | $0.15 | $0.15 | Text + Image → Text |
| 67 | Mistral: Mistral Small 4 | Mistral AI | 262K | N/A | $0.15 | $0.60 | Text + Image → Text |
| 68 | Qwen: Qwen3.5-9B | Qwen | 256K | 66K | $0.05 | $0.15 | Text + Image + Video → Text |
| 69 | xAI: Grok 4 | xAI | 256K | N/A | $3.00 | $15 | Image + Text → Text |
| 70 | OpenAI: o3 Mini High | OpenAI | 200K | 100K | $1.10 | $4.40 | Text + File → Text |
| 71 | Anthropic: Claude Opus 4.5 | Anthropic | 200K | 64K | $5.00 | $25 | File + Image + Text → Text |
| 72 | Perplexity: Sonar Pro Search | Perplexity | 200K | 8K | $3.00 | $15 | Text + Image → Text |
| 73 | OpenAI: o3 Mini | OpenAI | 200K | 100K | $1.10 | $4.40 | Text + File → Text |
| 74 | OpenAI: o3 Deep Research | OpenAI | 200K | 100K | $10 | $40 | Image + Text + File → Text |
| 75 | OpenAI: o4 Mini Deep Research | OpenAI | 200K | 100K | $2.00 | $8.00 | File + Image + Text → Text |
| 76 | Anthropic: Claude 3.5 Haiku | Anthropic | 200K | 8K | $0.80 | $4.00 | Text + Image → Text |
| 77 | OpenAI: o1 | OpenAI | 200K | 100K | $15 | $60 | Text + Image + File → Text |
| 78 | Anthropic: Claude 3 Haiku | Anthropic | 200K | 4K | $0.25 | $1.25 | Text + Image → Text |
| 79 | Anthropic: Claude 3.7 Sonnet (thinking) | Anthropic | 200K | 64K | $3.00 | $15 | Text + Image + File → Text |
| 80 | OpenAI: o3 | OpenAI | 200K | 100K | $2.00 | $8.00 | Image + Text + File → Text |
| 81 | OpenAI: o4 Mini | OpenAI | 200K | 100K | $1.10 | $4.40 | Image + Text + File → Text |
| 82 | OpenAI: o3 Pro | OpenAI | 200K | 100K | $20 | $80 | Text + File + Image → Text |
| 83 | Anthropic: Claude 3.7 Sonnet | Anthropic | 200K | 64K | $3.00 | $15 | Text + Image + File → Text |
| 84 | Free Models Router | OpenRouter | 200K | N/A | Free | Free | Text + Image → Text |
| 85 | OpenAI: o1-pro | OpenAI | 200K | 100K | $150 | $600 | Text + Image + File → Text |
| 86 | Anthropic: Claude Haiku 4.5 | Anthropic | 200K | 64K | $1.00 | $5.00 | Image + Text → Text |
| 87 | Anthropic: Claude Opus 4 | Anthropic | 200K | 32K | $15 | $75 | Image + Text + File → Text |
| 88 | Anthropic: Claude 3.5 Sonnet | Anthropic | 200K | 8K | $6.00 | $30 | Text + Image + File → Text |
| 89 | Anthropic: Claude Opus 4.1 | Anthropic | 200K | 32K | $15 | $75 | Image + Text + File → Text |
| 90 | Anthropic: Claude Sonnet 4 | Anthropic | 200K | 64K | $3.00 | $15 | Image + Text + File → Text |
| 91 | OpenAI: o4 Mini High | OpenAI | 200K | 100K | $1.10 | $4.40 | Image + Text + File → Text |
| 92 | Perplexity: Sonar Pro | Perplexity | 200K | 8K | $3.00 | $15 | Text + Image → Text |
| 93 | Meta: Llama Guard 4 12B | Meta | 164K | N/A | $0.18 | $0.18 | Image + Text → Text |
| 94 | Mistral: Mistral Small 3.1 24B | Mistral AI | 131K | 131K | $0.03 | $0.11 | Text + Image → Text |
| 95 | Qwen: Qwen3 VL 235B A22B Thinking | Qwen | 131K | 33K | $0.26 | $2.60 | Text + Image → Text |
| 96 | Mistral: Mistral Medium 3 | Mistral AI | 131K | N/A | $0.40 | $2.00 | Text + Image → Text |
| 97 | Qwen: Qwen3 VL 30B A3B Instruct | Qwen | 131K | 33K | $0.13 | $0.52 | Text + Image → Text |
| 98 | Arcee AI: Spotlight | Arcee AI | 131K | 66K | $0.18 | $0.18 | Image + Text → Text |
| 99 | Qwen: Qwen3 VL 30B A3B Thinking | Qwen | 131K | 33K | $0.13 | $1.56 | Text + Image → Text |
| 100 | Qwen: Qwen3 VL 8B Instruct | Qwen | 131K | 33K | $0.08 | $0.50 | Image + Text → Text |
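
The Input $/1M and Output $/1M columns can be turned into a rough per-request estimate. The sketch below assumes the prices are US dollars per one million tokens and that billing is a flat per-token rate; real bills may differ, since tiered pricing, caching discounts, and the way images, audio, or video are counted as tokens vary by provider. The names `Pricing` and `request_cost` are hypothetical, chosen for illustration only, and the example figures are taken from the GPT-4.1 row above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Per-million-token prices in USD, as listed in the table (assumed flat rate)."""
    input_per_million: float
    output_per_million: float


def request_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD from its token counts."""
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


# Prices from the "OpenAI: GPT-4.1" row: $2.00 input, $8.00 output per 1M tokens.
gpt_4_1 = Pricing(input_per_million=2.00, output_per_million=8.00)

# Hypothetical request: 10,000 input tokens and 2,000 output tokens.
cost = request_cost(gpt_4_1, input_tokens=10_000, output_tokens=2_000)
print(f"${cost:.4f}")  # prints "$0.0360"
```

Because output tokens are usually priced several times higher than input tokens, the Max Output column matters for budgeting as much as the Context column does.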