Automated program repair · Unity game bugs
Leaderboard
SWE-game-bench evaluates AI coding systems on real software issues from Unity game repositories. Each issue ships with an injected Unity test that decides whether a generated patch passes.
Score: composite score = parseable@10 × filehit@10 × pass@10. Values range from 0 to 1.
pass@10: expected success rate when sampling up to 10 attempts.

| # | System | Model | Config | Score | Parseable@10 | File hit@10 | pass@10 | Logs |
|---|---|---|---|---|---|---|---|---|
Ranked by Score = parseable@10 × filehit@10 × pass@10. Incomplete runs rank below complete runs.
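The scoring and ranking rule can be illustrated with a minimal sketch. The `Entry` class, its field names, and the sample values are hypothetical and do not come from the benchmark's code or results. The sketch also assumes that incomplete runs are ordered by Score among themselves, which the page does not state.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """One leaderboard row (hypothetical structure, for illustration only)."""
    system: str
    parseable_at_10: float  # in [0, 1]
    filehit_at_10: float    # in [0, 1]
    pass_at_10: float       # in [0, 1]
    complete: bool = True   # False if the run did not finish

    @property
    def score(self) -> float:
        # Composite score: the product of the three metrics, so it stays in [0, 1].
        return self.parseable_at_10 * self.filehit_at_10 * self.pass_at_10


def rank(entries):
    """Complete runs first, then by descending score within each group.

    Ordering incomplete runs by score is an assumption of this sketch.
    """
    return sorted(entries, key=lambda e: (not e.complete, -e.score))


# Made-up values, not benchmark results.
entries = [
    Entry("system-b", 0.9, 0.9, 0.9, complete=False),
    Entry("system-c", 1.0, 0.5, 0.5),
    Entry("system-a", 1.0, 0.8, 0.5),
]

for position, entry in enumerate(rank(entries), start=1):
    status = "complete" if entry.complete else "incomplete"
    print(f"{position}. {entry.system}: {entry.score:.3f} ({status})")
```

Because Score is a product, a low value in any one metric pulls the composite down: in this made-up data, `system-b` has the highest product but still ranks last because its run is incomplete.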