The Ultra-Bullet Evolution
A master class analysis of 24 engines competing under extreme time pressure — 9s+0.1s increment, single thread, 50 rounds
Intel Core i5-6500 · 16 GB RAM
346 / 1120 games played (30.9%)
Blitz 9s + 0.1s increment
GUI: Arena Chess
Tournament leader
SF 18dev
Elo 2909.6 · 66.2% score
+29.6 Elo
Biggest surprise
Reckless 0.10.0d
Rank #3 · above SF 17 & 17.1
59.4% score
Biggest collapse
RubiChess
Rank #24 · 35.8% score
-35.8 Elo
Most consistent
Caissa 1.25
Near-neutral across all blocks
+0.3 Elo
Full standings — all 24 engines
| # | Engine | Points | Games | Elo | ± Elo | Score % | Elo bar | Trend |
[Chart: Score % per 8-round block, top movers. Series: SF 18dev, Reckless 0.10.0d, SF 16.1 (recovery), RubiChess (collapse)]
Elo gain/loss per round block, selected engines (8 rounds per block; the final block covers rounds 41–50)
| Engine | R 1–8 | R 9–16 | R 17–24 | R 25–32 | R 33–40 | R 41–50 | Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| SF 18dev | +4.1 | +8.6 | +9.1 | +5.7 | +4.4 | +7.5 | +39.4 |
| SF 18 | +5.3 | +7.0 | +7.8 | +4.9 | +1.2 | +4.7 | +30.9 |
| Reckless 0.10.0d | +2.4 | +5.9 | +4.3 | +8.8 | +2.4 | +7.3 | +31.1 |
| SF 17 | +2.5 | -0.8 | +3.4 | +0.3 | +8.2 | +2.9 | +16.5 |
| PlentyChess 7.0.37 | +5.2 | +6.0 | +3.7 | +2.1 | +2.4 | +4.3 | +23.7 |
| SF 17.1 | +2.5 | +3.7 | +1.0 | -2.4 | +1.0 | +3.2 | +9.0 |
| Reckless 0.9.0 | +1.4 | +2.5 | -0.9 | +4.6 | +3.2 | -3.3 | +7.5 |
| SF 16.1 | -5.0 | +0.4 | +0.6 | +1.7 | +3.0 | +5.0 | +5.7 |
| Komodo Dragon 3.3 | -7.5 | -6.4 | -7.4 | -4.3 | -6.2 | -4.9 | -36.7 |
| RubiChess | -6.5 | -7.9 | -9.6 | -10.4 | -2.5 | -8.4 | -45.3 |
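The Total column can be cross-checked against the six block values. The following is a minimal sketch in Python, with the numbers copied from the table; the names `BLOCKS`, `check_totals`, and `always_positive` are illustrative choices, not part of any tournament tooling.

```python
# Per-block Elo changes copied from the table (R 1-8 ... R 41-50),
# paired with the Total column as printed.
BLOCKS = {
    "SF 18dev": ([4.1, 8.6, 9.1, 5.7, 4.4, 7.5], 39.4),
    "SF 18": ([5.3, 7.0, 7.8, 4.9, 1.2, 4.7], 30.9),
    "Reckless 0.10.0d": ([2.4, 5.9, 4.3, 8.8, 2.4, 7.3], 31.1),
    "SF 17": ([2.5, -0.8, 3.4, 0.3, 8.2, 2.9], 16.5),
    "PlentyChess 7.0.37": ([5.2, 6.0, 3.7, 2.1, 2.4, 4.3], 23.7),
    "SF 17.1": ([2.5, 3.7, 1.0, -2.4, 1.0, 3.2], 9.0),
    "Reckless 0.9.0": ([1.4, 2.5, -0.9, 4.6, 3.2, -3.3], 7.5),
    "SF 16.1": ([-5.0, 0.4, 0.6, 1.7, 3.0, 5.0], 5.7),
    "Komodo Dragon 3.3": ([-7.5, -6.4, -7.4, -4.3, -6.2, -4.9], -36.7),
    "RubiChess": ([-6.5, -7.9, -9.6, -10.4, -2.5, -8.4], -45.3),
}


def check_totals(blocks, tol=0.05):
    """Return the engines whose block sum differs from the printed total."""
    return [
        name
        for name, (deltas, total) in blocks.items()
        if abs(round(sum(deltas), 1) - total) > tol
    ]


def always_positive(blocks):
    """Return the engines that gained Elo in every block."""
    return [
        name
        for name, (deltas, _) in blocks.items()
        if all(d > 0 for d in deltas)
    ]


if __name__ == "__main__":
    print("Rows with inconsistent totals:", check_totals(BLOCKS))
    print("Positive in every block:", always_positive(BLOCKS))
```

An empty list from `check_totals` means every printed total matches the sum of its blocks to one decimal place. The check covers only the ten engines listed in this table.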
Key findings
Reckless in ultra-bullet
Reckless 0.10.0d sits only 1.5 points behind SF 18dev, although it has played 100 more games, so the raw-point gap understates the real difference. Its 59.4% score at 9 seconds against the full field places it highest among the non-Stockfish engines. One plausible reading is that its search and evaluation are unusually efficient at very short time controls, but the results alone cannot confirm that.
SF 16.1 — the comeback story
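Started at -5.0 Elo in R1–8, the weakest opening block of any Stockfish version in the table, then improved in every subsequent block, reaching +5.0 in R41–50. This is the largest swing from negative to positive among the selected engines. An engine does not literally "warm up" between rounds; the more likely explanation is that the early results were noisy and the rating converged as more games accumulated.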
Single-thread penalty
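Komodo Dragon (-29.6) and RubiChess (-35.8) post the largest Elo losses in the field. Both are alpha-beta engines that run correctly on a single thread (MCTS is only an optional mode in Komodo Dragon), so the results do not indicate a search that stops working without parallelism. What they show is that these two engines lose more strength than the rest of the field at one thread and 9 seconds. Whether the cause is time management, search overhead, or evaluation speed cannot be determined from the results alone.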
Points vs Elo — the illusion
PlentyChess 7.0.37 holds 617.0 raw points, more than any other engine, yet ranks #5 by Elo. It has played 1,050 games versus 900 for the Stockfish versions, so its 617.0 points correspond to a score of about 58.8%. Raw points are not comparable across unequal game counts; Score % and Elo are the metrics that allow a fair comparison.
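The normalisation described here can be sketched in a few lines of Python. The score percentage is points divided by games; the Elo conversion uses the standard logistic Elo model, which is an assumption on our part, since the rating method used for the standings is not stated. The function names are illustrative.

```python
import math


def score_pct(points, games):
    """Score percentage: points won as a share of games played."""
    if games <= 0:
        raise ValueError("games must be positive")
    return 100.0 * points / games


def elo_diff_from_score(score):
    """Rating difference implied by a score fraction in (0, 1).

    Standard logistic Elo model (assumed, not confirmed by the source):
    the result is relative to the average opponent faced.
    """
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


if __name__ == "__main__":
    # PlentyChess 7.0.37: 617.0 points from 1,050 games (figures from the text).
    pct = score_pct(617.0, 1050)
    print(f"PlentyChess score: {pct:.1f}%")
    print(f"Implied Elo vs. average opponent: {elo_diff_from_score(pct / 100):+.1f}")
```

The implied difference is measured against the average opponent, so it will not necessarily match the ± Elo column, which may be computed against a different baseline.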
Anomaly: Alexandria 9.0.0 ranks below 8.1.12
In a version series, a newer release is normally expected to outperform the older one. Here, Alexandria 9.0.0 (Elo 2816.8, -9.2) ranks below Alexandria 8.1.12 (Elo 2822.3, -16.6), a gap of 5.5 Elo. A gap that small may be within statistical noise at this stage of the tournament. If it persists, one possible explanation is a speed/quality trade-off in 9.0.0 that does not suit single-thread, 9-second time controls.
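To judge whether a gap like the 5.5 Elo between the two listed ratings (2822.3 vs 2816.8) is meaningful, it helps to estimate the error margin on a single engine's result. The sketch below is a rough approximation under stated assumptions: the logistic Elo model, a normal approximation, and a per-game variance of `score * (1 - score)`, which is an upper bound because draws reduce variance. The game count of 900 and the 50% score in the example are hypothetical, since the Alexandria game counts are not given here.

```python
import math


def elo_from_score(score):
    """Logistic Elo model: rating difference implied by a score fraction."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_interval(score, games, z=1.96):
    """Approximate interval, in Elo, around a score fraction.

    Uses score * (1 - score) as the per-game variance. With draws in the
    mix the true variance is lower, so this interval is conservative.
    z = 1.96 corresponds to roughly 95% under a normal approximation.
    """
    if games <= 0:
        raise ValueError("games must be positive")
    se = math.sqrt(score * (1.0 - score) / games)
    # Clamp to keep the logarithm defined for extreme scores.
    lo = max(score - z * se, 1e-6)
    hi = min(score + z * se, 1.0 - 1e-6)
    return elo_from_score(lo), elo_from_score(hi)


if __name__ == "__main__":
    # HYPOTHETICAL inputs: 50% score over 900 games.
    low, high = elo_interval(0.5, 900)
    print(f"Approximate 95% interval: {low:+.1f} to {high:+.1f} Elo")
```

Under these assumed inputs the interval spans a little over 20 Elo on either side, so a 5.5 Elo gap would sit well inside it. A high draw rate would narrow the interval, but probably not enough to separate two engines this close.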
Final standings projection
Tournament winner
Stockfish 18dev
Positive Elo gain in every single block — no signs of slowing. Barring a structural change, this engine finishes first.
Will hold rank #3
Reckless 0.10.0d
Consistently positive across all blocks, with a strong +7.3 in the last period. Could threaten rank #2 if SF 18 loses momentum.
Watch closely
Stockfish 16.1
Recovery arc is real and accelerating (+5.0 in R41–50). Likely to overtake SF 17.1 in final standings if the trend holds.
Bottom two locked
Komodo Dragon & RubiChess
Neither engine posts a positive block. RubiChess has the worst single block in the table (-10.4 in R25–32), and its mildest block (-2.5 in R33–40) was followed by -8.4. Both are projected to finish in the last two places.