Agents Directory

CursorBench 3.1

Coding

Ambiguous, multi-file tasks drawn from real Cursor sessions, testing codebase understanding, bug finding, planning, and code review.

Figure: Score vs. cost
Leaderboard

#   Provider    Model              Config      Score   Cost    Tokens  Steps
1   Claude      Claude Fable 5     Max         72.9%   $18.02  63,842  76
2   Claude      Claude Fable 5     Extra High  72%     $13.74  48,754  63
3   Claude      Claude Fable 5     High        70.6%   $10.81  37,173  54
4   Claude      Claude Fable 5     Medium      69.8%   $8.27   28,507  47
5   Claude      Claude Opus 4.7    Max         64.8%   $11.02  62,989  96
6   OpenAI      GPT-5.5            Extra High  64.3%   $4.37   17,905  46
7   Claude      Claude Fable 5     Low         64.2%   $5.70   18,882  36
8   Claude      Claude Opus 4.8    Max         63.8%   $7.59   77,370  60
9   Cursor      Composer 2.5       -           63.2%   $0.55   15,152  37
10  OpenAI      GPT-5.5            High        62.6%   $3.59   13,329  40
11  Claude      Claude Opus 4.8    Extra High  62.1%   $6.14   55,622  54
12  Claude      Claude Opus 4.7    Extra High  61.6%   $7.11   43,942  72
13  Claude      Claude Opus 4.7    High        59.4%   $5.01   32,227  59
14  OpenAI      GPT-5.5            Medium      59.2%   $2.22   9,065   35
15  Claude      Claude Opus 4.8    High        58.4%   $4.41   36,788  45
16  Claude      Claude Opus 4.8    Medium      56.6%   $3.83   31,684  41
17  Claude      Claude Opus 4.8    Low         54.3%   $2.93   22,726  36
18  Claude      Claude Opus 4.7    Medium      52.7%   $2.93   19,193  41
19  Cursor      Composer 2         -           52.2%   $0.56   14,163  40
20  Gemini      Gemini 3.5 Flash   -           49.8%   $1.94   35,105  79
21  Claude      Claude Sonnet 4.6  Max         49%     $3.09   40,280  55
22  Claude      Claude Sonnet 4.6  High        48.8%   $3.06   37,352  57
23  OpenAI      GPT-5.5            Low         48.8%   $1.19   4,923   24
24  Claude      Claude Opus 4.7    Low         48.3%   $1.87   13,164  29
25  MoonshotAI  Kimi K2.6          -           47.6%   $1.27   24,783  56
26  Claude      Claude Sonnet 4.6  Medium      46%     $2.64   31,360  50
27  Claude      Claude Sonnet 4.6  Low         41.5%   $1.89   21,211  50
28  MoonshotAI  Kimi K2.5          -           31.9%   $0.87   9,446   30
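
One way to read the score and cost columns of the leaderboard together is to keep only the configurations that no cheaper entry matches or beats on score (a Pareto frontier). The sketch below is illustrative only and is not part of CursorBench or of this site: the `ROWS` list and the `pareto_frontier` helper are hypothetical names, the values are transcribed from the leaderboard, and cost is taken in whatever unit the Cost column uses.

```python
# Rows transcribed from the leaderboard: (model and config, score in %, cost in USD).
# ROWS and pareto_frontier are hypothetical names used only for this sketch.
ROWS = [
    ("Claude Fable 5 Max", 72.9, 18.02),
    ("Claude Fable 5 Extra High", 72.0, 13.74),
    ("Claude Fable 5 High", 70.6, 10.81),
    ("Claude Fable 5 Medium", 69.8, 8.27),
    ("Claude Opus 4.7 Max", 64.8, 11.02),
    ("GPT-5.5 Extra High", 64.3, 4.37),
    ("Claude Fable 5 Low", 64.2, 5.70),
    ("Claude Opus 4.8 Max", 63.8, 7.59),
    ("Composer 2.5", 63.2, 0.55),
    ("GPT-5.5 High", 62.6, 3.59),
    ("Claude Opus 4.8 Extra High", 62.1, 6.14),
    ("Claude Opus 4.7 Extra High", 61.6, 7.11),
    ("Claude Opus 4.7 High", 59.4, 5.01),
    ("GPT-5.5 Medium", 59.2, 2.22),
    ("Claude Opus 4.8 High", 58.4, 4.41),
    ("Claude Opus 4.8 Medium", 56.6, 3.83),
    ("Claude Opus 4.8 Low", 54.3, 2.93),
    ("Claude Opus 4.7 Medium", 52.7, 2.93),
    ("Composer 2", 52.2, 0.56),
    ("Gemini 3.5 Flash", 49.8, 1.94),
    ("Claude Sonnet 4.6 Max", 49.0, 3.09),
    ("Claude Sonnet 4.6 High", 48.8, 3.06),
    ("GPT-5.5 Low", 48.8, 1.19),
    ("Claude Opus 4.7 Low", 48.3, 1.87),
    ("Kimi K2.6", 47.6, 1.27),
    ("Claude Sonnet 4.6 Medium", 46.0, 2.64),
    ("Claude Sonnet 4.6 Low", 41.5, 1.89),
    ("Kimi K2.5", 31.9, 0.87),
]


def pareto_frontier(rows):
    """Return the rows that beat the score of every cheaper-or-equal-cost row."""
    frontier = []
    best_score = float("-inf")
    # Walk from cheapest to most expensive; on equal cost, higher score first.
    for name, score, cost in sorted(rows, key=lambda r: (r[2], -r[1])):
        if score > best_score:
            frontier.append((name, score, cost))
            best_score = score
    return frontier


if __name__ == "__main__":
    for name, score, cost in pareto_frontier(ROWS):
        print(f"{name:<28} {score:5.1f}%  ${cost:6.2f}")
```

With the figures as transcribed, six configurations remain on the frontier: Composer 2.5, GPT-5.5 Extra High, and Claude Fable 5 at Medium, High, Extra High, and Max.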
Sources:
CursorBench (cursor.com)
Used in rankings:
  • Best Chinese AI models
  • Best free coding models on OpenRouter
  • Best free models for Hermes
  • Best free models for OpenClaw
  • Best free models on OpenRouter
  • Best models for Hermes
  • Best open-source AI models
  • Best open-source models for Hermes
  • Best open-source models for OpenClaw
Details:
  • Category: Coding
  • Created by: Cursor
  • Models tested: 10
  • Configs tested: 28
  • Leader: Claude Fable 5
  • Top score: 72.9%

Updated June 2026


© 2026 Agents Directory