Flagship leaderboard

AI Coding Leaderboard

Programming and agentic coding benchmarks for frontier language models.

Updated 368d ago · Rankings show governed observations with source provenance.


1	OpenAI o3 OpenAI		69.1	—	—	37.1	—	—	80.8	81.3

