Updated March 26, 2026 · Based on independent benchmark data
Gemini 3.1 Pro Preview leads in intelligence with a score of 57.2 vs 14.2. Qwen3 4B (Reasoning) is 18.2x cheaper on input at $0.11/1M tokens vs $2.00/1M.
| Metric | Qwen3 4B (Reasoning) | Gemini 3.1 Pro Preview |
|---|---|---|
| Intelligence Score | 14.2 | 57.2 |
| Coding Score | N/A | 55.5 |
| Math Score | 22.3 | N/A |
| Speed (tok/s) | 103 | 113 |
| Latency (TTFT) | 0.96s | 23.84s |
| Input Price / 1M tokens | $0.11 | $2.00 |
| Output Price / 1M tokens | $1.26 | $12.00 |
| Context Window | N/A | 1.0M |
| Max Output Tokens | N/A | N/A |
| Input Modalities | Text | Audio + File + Image + Text + Video |
Gemini 3.1 Pro Preview outperforms Qwen3 4B (Reasoning) on the intelligence index with a score of 57.2 compared to 14.2.
Both models deliver similar output speeds: Qwen3 4B (Reasoning) at 103 tok/s and Gemini 3.1 Pro Preview at 113 tok/s. Time to first token differs far more: 0.96s for Qwen3 4B (Reasoning) vs 23.84s for Gemini 3.1 Pro Preview, a roughly 25x gap that dominates perceived responsiveness in interactive applications.
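To see how time to first token and output speed combine, the sketch below estimates end-to-end response time as TTFT plus output length divided by output speed. The 500-token response length and the function name `estimated_response_seconds` are assumptions chosen for illustration; real latency also depends on prompt size, load, and how much reasoning the model does before answering.

```python
def estimated_response_seconds(ttft_s, tokens_per_second, output_tokens):
    """Rough estimate: time to first token plus time to stream the output."""
    return ttft_s + output_tokens / tokens_per_second


# (TTFT in seconds, output speed in tok/s), taken from the comparison table.
models = {
    "Qwen3 4B (Reasoning)": (0.96, 103),
    "Gemini 3.1 Pro Preview": (23.84, 113),
}

# Assumed response length of 500 output tokens (hypothetical workload).
for name, (ttft, speed) in models.items():
    total = estimated_response_seconds(ttft, speed, output_tokens=500)
    print(f"{name}: {total:.2f}s")
```

Under this simple model, the slightly higher output speed of Gemini 3.1 Pro Preview only offsets its longer time to first token for very long responses.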
Qwen3 4B (Reasoning) is more affordable at $0.11/1M input tokens ($1.26/1M output), while Gemini 3.1 Pro Preview costs $2.00/1M input ($12/1M output). That makes Gemini 3.1 Pro Preview 18.2x more expensive per input token and about 9.5x more expensive per output token, which can add up significantly at scale. For a typical workload of 100 requests per day at 2,000 input tokens each (6M tokens over a 30-day month), Qwen3 4B (Reasoning) would cost approximately $0.66/month vs $12.00/month for Gemini 3.1 Pro Preview in input costs alone.
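The monthly figures above can be reproduced with a short calculation. This is a minimal sketch: the helper name `monthly_input_cost` is hypothetical, and the 30-day month is an assumption inferred from the quoted totals. It covers input tokens only, as the estimate above does.

```python
def monthly_input_cost(price_per_million, requests_per_day=100,
                       tokens_per_request=2_000, days=30):
    """Input-token cost in dollars for one month of the given workload."""
    total_tokens = requests_per_day * tokens_per_request * days
    return total_tokens / 1_000_000 * price_per_million


# Input prices per 1M tokens, taken from the comparison table.
qwen_cost = monthly_input_cost(0.11)
gemini_cost = monthly_input_cost(2.00)

print(f"Qwen3 4B (Reasoning): ${qwen_cost:.2f}/month")
print(f"Gemini 3.1 Pro Preview: ${gemini_cost:.2f}/month")
print(f"Input price ratio: {2.00 / 0.11:.1f}x")
print(f"Output price ratio: {12.00 / 1.26:.1f}x")
```

To include output costs, the same function can be called with the output prices and the expected number of output tokens per request.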
Choose Qwen3 4B (Reasoning) when you need lower cost. Choose Gemini 3.1 Pro Preview when you need higher intelligence (57.2).
Qwen3 4B (Reasoning) is cheaper at $0.11/1M input tokens vs $2.00/1M for Gemini 3.1 Pro Preview.
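Gemini 3.1 Pro Preview has the higher output speed, at 113 tok/s compared to Qwen3 4B (Reasoning)'s 103 tok/s. Qwen3 4B (Reasoning) starts responding much sooner, with a time to first token of 0.96s vs 23.84s.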
Qwen3 4B (Reasoning) does not support image input; it accepts text only. Gemini 3.1 Pro Preview does support images, along with audio, file, text, and video input.
It depends on your priorities. Gemini 3.1 Pro Preview scores higher on intelligence (57.2 vs 14.2), but Qwen3 4B (Reasoning) may be better for specific use cases like budget-conscious projects or latency-sensitive applications, given its lower price and 0.96s time to first token.
Data last synced: March 26, 2026
| Metric | Qwen3 4B (Reasoning) | Gemini 3.1 Pro Preview |
|---|---|---|
| Output Modalities | Text | Text |
| Free Tier | No | No |