Model Rankings

AI models ranked

Curated rankings of frontier and open-source AI models for developers. Each model is scored on coding ability, reasoning, instruction following, speed, and cost efficiency.

43 models - page 1 of 4

🥇

Claude Sonnet 4.6

⚡ Top Pick · Text · Code
Anthropic · 200K ctx

Anthropic's current frontier model. State-of-the-art on SWE-bench, best-in-class instruction following, and extended thinking built-in. The go-to for agentic coding workflows.

Performance
Coding: 88
Reasoning: 84
Instruction: 92
Speed: 68
Cost efficiency: 62
Cost: $3 in / $15 out per 1M tokens
Best for: Agentic coding, SWE-bench leader, Extended thinking, Instruction accuracy
🥈

o3

🧠 Best Reasoning · Text · Code
OpenAI · 200K ctx

OpenAI's most powerful reasoning model. Unmatched on complex logic, math, and scientific reasoning tasks.

Performance
Coding: 85
Reasoning: 95
Instruction: 78
Speed: 28
Cost efficiency: 15
Cost: $10 in / $40 out per 1M tokens
Best for: Complex reasoning, Math & science, Competition coding, Deep analysis
🥉

Claude 3.7 Sonnet

Text · Code
Anthropic · 200K ctx

Anthropic's previous-generation flagship with extended thinking. Exceptional coding and instruction following - predecessor to the Sonnet 4.x line, still widely deployed.

Performance
Coding: 83
Reasoning: 81
Instruction: 88
Speed: 68
Cost efficiency: 65
Cost: $3 in / $15 out per 1M tokens
Best for: Agentic coding, Extended thinking, Long context, Instruction following
#4

Gemini 2.5 Pro

Text · Code
Google · 1M ctx

Google's most capable model with an industry-leading 1M token context window. Excellent at long-context analysis and coding.

Performance
Coding: 84
Reasoning: 85
Instruction: 82
Speed: 60
Cost efficiency: 70
Cost: $1.25 in / $10 out per 1M tokens
Best for: 1M context window, Long document analysis, Multimodal, Coding
#5

Grok 4.20 Reasoning

Text
xAI · 2M ctx · 1.8s TTFT · 379 tps

xAI's newest reasoning model with a 2M token context window. 1.8s time-to-first-token, 379 tps throughput - strong chain-of-thought for complex tasks.

Performance
Coding: 82
Reasoning: 88
Instruction: 79
Speed: 45
Cost efficiency: 68
Cost: $2 in / $6 out per 1M tokens
Best for: 2M context, Chain-of-thought, Math & science, Complex tasks
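
The latency figures quoted for Grok 4.20 Reasoning (1.8s time-to-first-token, 379 tps) can be combined into a rough end-to-end estimate for a single response. The sketch below is illustrative only: it assumes "tps" means output tokens per second for one stream and that the rate stays constant, neither of which the listing states. The function name and the 1,000-token response length are hypothetical.

```python
def estimated_response_seconds(ttft_s: float, tokens_per_s: float, output_tokens: int) -> float:
    """Rough wall-clock estimate: time-to-first-token plus generation time.

    Assumes a constant single-stream output rate (an assumption, not a
    definition taken from the listing).
    """
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return ttft_s + output_tokens / tokens_per_s


if __name__ == "__main__":
    # Listed figures for Grok 4.20 Reasoning: 1.8 s TTFT, 379 tps.
    # The 1,000-token response length is a hypothetical workload.
    seconds = estimated_response_seconds(ttft_s=1.8, tokens_per_s=379, output_tokens=1000)
    print(f"Estimated time for a 1,000-token response: {seconds:.2f} s")
```

Under these assumptions, time-to-first-token dominates short replies while throughput dominates long ones, so the two numbers are worth reading together rather than separately.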
#6

o4-mini

Text · Code
OpenAI · 128K ctx

Compact reasoning model from OpenAI. Strong reasoning at a fraction of o3's cost, with excellent coding and math.

Performance
Coding: 80
Reasoning: 86
Instruction: 76
Speed: 62
Cost efficiency: 82
Cost: $1.10 in / $4.40 out per 1M tokens
Best for: Reasoning value, Coding tasks, Math, Cost-efficient
#7

Grok 4.20

Text
xAI · 2M ctx · 0.2s TTFT

xAI's latest non-reasoning model with a 2M context window. 0.2s latency for snappy responses at scale.

Performance
Coding: 78
Reasoning: 82
Instruction: 80
Speed: 85
Cost efficiency: 68
Cost: $2 in / $6 out per 1M tokens
Best for: 2M context, Low latency, Broad capability, xAI platform
#8

Grok 4.20 Multi-Agent

Text
xAI · 2M ctx · 1.9s TTFT · 3097 tps

xAI's multi-agent optimized variant with extraordinary 3097 tps throughput and a 2M token context. Built for parallelized agentic pipelines.

Performance
Coding: 78
Reasoning: 80
Instruction: 81
Speed: 93
Cost efficiency: 68
Cost: $2 in / $6 out per 1M tokens
Best for: 3097 tps, Multi-agent, 2M context, Pipeline optimized
#9

DeepSeek R1

🔓 Open Source · Text · Code
DeepSeek · 128K ctx

Open-source reasoning model rivaling o1. Transparent chain-of-thought reasoning, freely deployable.

Performance
Coding: 79
Reasoning: 88
Instruction: 71
Speed: 42
Cost efficiency: 95
Cost: $0.55 in / $2.19 out per 1M tokens
Best for: Open-source reasoning, Math/science, Transparent CoT, Self-hostable
#10

GPT-4.5

Text
OpenAI · 128K ctx

OpenAI's largest non-reasoning model. Improved naturalness, nuanced coding assistance, and broad knowledge. Strong general-purpose workhorse for complex prompts.

Performance
Coding: 74
Reasoning: 77
Instruction: 82
Speed: 62
Cost efficiency: 12
Cost: $75 in / $150 out per 1M tokens
Best for: Instruction following, Nuanced reasoning, Broad knowledge, Coding
#11

DeepSeek V3

Open Source · Text · Code
DeepSeek · 128K ctx

Chinese research lab's open-source flagship. Extraordinary coding ability at an unbelievably low price point.

Performance
Coding: 80
Reasoning: 76
Instruction: 77
Speed: 73
Cost efficiency: 97
Cost: $0.27 in / $1.10 out per 1M tokens
Best for: Best price/performance, Coding, Open weights, Strong benchmarks
#12

GPT-4o

Text · Image
OpenAI · 128K ctx

OpenAI's versatile multimodal model. Strong across text, vision, and audio, with fast response times and broad capability.

Performance
Coding: 76
Reasoning: 74
Instruction: 82
Speed: 82
Cost efficiency: 74
Cost: $2.50 in / $10 out per 1M tokens
Best for: Multimodal, Function calling, Speed, Broad knowledge
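
Because every price on this page is quoted per 1M input and output tokens, the per-request cost of different models can be compared with simple arithmetic. The sketch below copies a handful of the listed prices; the function name, the `PRICES` table layout, and the sample workload (20,000 input tokens, 2,000 output tokens) are hypothetical and chosen only for illustration. It ignores caching discounts, reasoning-token overhead, and any other billing detail not shown in the listing.

```python
# Listed prices in USD per 1M tokens: (input, output).
PRICES = {
    "Claude Sonnet 4.6": (3.00, 15.00),
    "o3": (10.00, 40.00),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "DeepSeek V3": (0.27, 1.10),
    "GPT-4.5": (75.00, 150.00),
}


def request_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request at the listed per-1M-token prices."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a 20,000-token prompt with a 2,000-token reply.
    for name in sorted(PRICES, key=lambda m: request_cost_usd(m, 20_000, 2_000)):
        print(f"{name:<20} ${request_cost_usd(name, 20_000, 2_000):.4f}")
```

The loop prints the models from cheapest to most expensive for that workload; changing the input/output mix can reorder them, since output tokens are priced several times higher than input tokens for every model listed here.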