The past seven days delivered more significant AI coding developments than most quarters used to produce. If you're a technical leader trying to make sense of where this space is heading, here's what actually matters from this week's chaos.
Claude Opus 4.5 launched November 24 and immediately became the new standard for coding tasks. The model hit 80.9% on SWE-bench Verified, the first model to break 80% on the benchmark that actually matters. GPT-5.1 sits at 76.3%, Gemini 3 Pro at 76.2%.
More important than the benchmark: a 67% price reduction. Input now runs $5 per million tokens...
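To put that price cut in concrete terms, here is a minimal back-of-the-envelope sketch. It takes the two figures the article states ($5 per million input tokens and a 67% reduction), derives the implied prior rate, and estimates monthly input spend for a hypothetical workload; the 500M-token volume is an assumption for illustration, and output-token pricing is omitted because it is truncated in the source.

```python
# Sanity-check the Opus 4.5 pricing claim: if input tokens now cost $5 per
# million and that is a ~67% cut, the implied prior rate is roughly $15
# per million. Input tokens only; output pricing is not covered here.

NEW_INPUT_PRICE = 5.00   # USD per million input tokens (from the article)
PRICE_CUT = 0.67         # 67% reduction (from the article)

implied_old_price = NEW_INPUT_PRICE / (1 - PRICE_CUT)  # ~ $15.15 per million


def monthly_input_cost(tokens_per_month: float, price_per_million: float) -> float:
    """Cost in USD for a given monthly input-token volume."""
    return tokens_per_month / 1_000_000 * price_per_million


# Hypothetical workload: a team pushing 500M input tokens per month.
volume = 500_000_000
print(f"Implied old rate: ${implied_old_price:.2f}/M tokens")
print(f"Old monthly input cost: ${monthly_input_cost(volume, implied_old_price):,.0f}")
print(f"New monthly input cost: ${monthly_input_cost(volume, NEW_INPUT_PRICE):,.0f}")
```

At that assumed volume, the cut takes monthly input spend from roughly $7,600 to $2,500, which is the kind of delta that changes whether teams route everyday coding tasks to a frontier model at all.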
Note: This analysis was compiled by AI Power Rankings based on publicly available information. Metrics and insights are extracted to provide quantitative context for tracking AI tool developments.