Cue the maths: predicting snooker's next champion with Elo

Elo, part 2: How maths, models and millions of simulations might tell us who lifts the trophy

Welcome to the Crucible

This blog is usually all about board games, but let’s stretch the definition just a little: snooker is, after all, one of the most widely followed tabletop games in the world. And with the World Championship kicking off at the Crucible Theatre in Sheffield, I couldn’t resist the excuse to dive into something a bit different.

In the last article, we looked at how Elo ratings can be used to measure player strength over time. This time, we’ll take it a step further: using historical match data, a bit of Python, and a lot of simulated tournaments, we’ll try to predict who’s most likely to lift the trophy this year. We’ll also compare our predictions to what the betting markets say – and see whether the wisdom of the crowd agrees with the cold logic of the model.

Building the model: Elo meets the baize

Decades of data

So, let’s break off and finally calculate some Elo ratings. For this, snooker.org kindly provided data on 68,260 matches from 2,163 events contested by 4,212 players, ranging from 1975 till last Wednesday, via their API. I’ve included as many matches as I could find, regardless of tour, ranking status or eligible player group, as long as they weren’t team matches and didn’t show any kind of inconsistency. For Elo calculations, it’s important to sequence matches correctly, and some matches in the database weren’t correctly labelled, but I did my best to get data that’s as clean as possible. Note that I did not take the frame score into account, but only cared about win/loss: since a match is stopped as soon as a player reaches the winning score and dead frames aren’t played out, the exact scoreline has little bearing on predictions.1

How Elo predicts the winners

To recap the actual calculations: all players start at an Elo rating of 0. (As mentioned before, it really could be any value, but we’ll stick with the simplest one.) Using the ratings \(r_A\) and \(r_B\) before the match, we can predict A’s win probability \(p_A\) like this:

\[ p_A = \frac{1}{1 + 10^{-(r_A - r_B) / 400}}. \]

As usual, we can calculate B’s chances via \(p_B=1-p_A\), so we won’t need to worry much about that. Once the match is done, we can compare that prediction with the actual outcome \(s_A\), where we score a win as 1 and a loss as 0. We then update A’s rating:

\[ r_A \leftarrow r_A + K (s_A - p_A), \]

where \(K\) is the update factor I’ve set to 42 for the purpose of this exercise since it’s the value that yields the most accurate predictions.2 (Much more on this in the next article.)
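Here’s a minimal sketch of how those two formulas could look in Python. The function names are purely illustrative and not necessarily what the actual code uses:

```python
def win_probability(r_a: float, r_b: float) -> float:
    """Predicted probability that A beats B, given both ratings before the match."""
    return 1 / (1 + 10 ** (-(r_a - r_b) / 400))


def update_rating(r_a: float, r_b: float, s_a: float, k: float = 42) -> float:
    """A's new rating after a match with outcome s_a (1 = win, 0 = loss)."""
    return r_a + k * (s_a - win_probability(r_a, r_b))


# A player rated 400 points above their opponent is a strong favourite.
print(round(win_probability(400, 0), 3))  # 0.909
```

Note that only the rating difference matters: shifting both ratings by the same amount leaves the prediction unchanged.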

Match by match: how ratings shift

Let’s look at some examples. Before the very first match in the database, Ray Reardon vs John Spencer on 1975-01-17, we didn’t know anything about any player, so they all had the initial rating of 0. If you plug a rating difference of 0 into the formula, you’ll see that we predict even chances of winning for both players (which makes perfect sense). John Spencer won that match, so we updated

\[ r_{\text{JS}} \leftarrow 0 + 42 \cdot (1 - 0.5) = 21. \]

His opponent got his rating reduced by the same amount3: \(r_{\text{RR}}\leftarrow-21\). I wrote a simple Python script to carry out these calculations for all 68,259 matches that followed.
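The core loop of such a script might look roughly like the sketch below. It assumes the matches have already been sorted chronologically and reduced to `(winner, loser)` pairs; that input format and the name `rate_matches` are assumptions for illustration, not the actual data layout of the snooker.org API or my real code:

```python
from collections import defaultdict

K = 42  # update factor used throughout this article


def win_probability(r_a, r_b):
    """Predicted probability that A beats B."""
    return 1 / (1 + 10 ** (-(r_a - r_b) / 400))


def rate_matches(matches, k=K):
    """Process (winner, loser) pairs in chronological order; return all ratings."""
    ratings = defaultdict(float)  # every player starts at a rating of 0
    for winner, loser in matches:
        p_winner = win_probability(ratings[winner], ratings[loser])
        delta = k * (1 - p_winner)  # the winner scored 1
        ratings[winner] += delta
        ratings[loser] -= delta  # zero-sum: the loser drops by the same amount
    return dict(ratings)


# The very first match in the database:
print(rate_matches([("John Spencer", "Ray Reardon")]))
# {'John Spencer': 21.0, 'Ray Reardon': -21.0}
```

Because each update depends on the ratings left behind by all earlier matches, the order of the input list matters, which is why correct sequencing is so important.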

Let’s take a look at one more match: the final of the most recent tournament, the 2025 Tour Championship, played between snooker legends John Higgins and Mark Selby, both with four world titles to their name. They went into the match with Elo ratings of 718.3 and 714.5, respectively. This means we would’ve predicted Higgins’ win probability to be 50.5%. The match was indeed won by John Higgins, who gained \(42\cdot(1-0.505)=20.8\) points, whilst Mark Selby lost the same amount, for new (and current) ratings of 739.0 and 693.7, respectively.

Who’s on top? Elo’s current kings

As mentioned, my code diligently carried out the Elo predictions and updates for every single match from 1975 till the 2025 World Championship Qualifiers earlier this week. These are the ten currently highest rated players:

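| Rank | Name | Elo | Matches | First match |
|---|---|---|---|---|
| 1 | John Higgins | 739.0 | 1436 | 1992-10-25 |
| 2 | Judd Trump | 709.2 | 1407 | 2005-11-03 |
| 3 | Kyren Wilson | 695.9 | 1055 | 2010-06-27 |
| 4 | Mark Selby | 693.7 | 1446 | 1999-10-24 |
| 5 | Zhao Xintong | 687.6 | 424 | 2012-06-18 |
| 6 | Joe O’Connor | 646.7 | 367 | 2012-09-05 |
| 7 | Barry Hawkins | 641.1 | 1153 | 1997-03-25 |
| 8 | Neil Robertson | 624.1 | 1236 | 1999-03-20 |
| 9 | Shaun Murphy | 606.9 | 1225 | 2001-02-11 |
| 10 | Ali Carter | 605.2 | 1194 | 1997-03-25 |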

By winning the 2025 Tour Championship, John Higgins claimed back the top spot he first held after winning his maiden world title in 1998. The list mostly contains consistently successful players over the past decades, as well as more recently rising stars like Zhao Xintong and Joe O’Connor.

Rising stars and fading legends

It’s fun to look back in time and check how players’ ratings evolved over time: Who was the highest rated player of his time? When did his ratings rise and fall? Let’s take a look across the decades:

The evolution of the Elo ratings of the best three players of the 1980's

The 1980’s were undoubtedly the years of Steve Davis, who won six world championships. The only way was up for him. It’s also noteworthy how teenager Stephen Hendry rose to prominence.

The evolution of the Elo ratings of the best three players of the 1990's

Just as Steve Davis dominated the 80’s, Stephen Hendry dominated the next decade, winning seven of the ten world championships. The plot also quite clearly shows the change in generations when John Higgins, the first of the “Class of ‘92”, overtook Stephen Hendry in 1998.

The evolution of the Elo ratings of the best three players of the 2000's

The first decade in the new millennium was strongly influenced by the other two members of the “Class of ‘92”: Ronnie O’Sullivan and Mark Williams.

The evolution of the Elo ratings of the best three players of the 2010's

Maybe the first thing you’ll notice in the 2010’s is how much more often the ratings change, reflecting a notable increase in tournaments. Mark Selby won three world titles in this decade.

The evolution of the Elo ratings of the best three players of the 2020's

Judd Trump won more titles than any other player in recent years, reaching an all-time high Elo rating of 837 in February 2021, though notably he’s been struggling with the long distances at the World Championship, “only” winning one so far, in 2019.

You might also notice how sharply Ronnie O’Sullivan’s rating dropped recently. This is because he lost a couple of matches in his last tournament and then withdrew from the next, and withdrawals are counted as losses in my code. (If you’ve ever played games online, you’ll agree that players who rage-quit should still suffer the full Elo penalty.) Still, he’s the most successful player of all time and has been rated highest for a total of 9.5 years, longer than anybody else.

Here’s the top 10 of the players who spent the most time at the top spot:

| Name | Months | First | Last |
|---|---|---|---|
| Ronnie O’Sullivan | 114 | 1998-03-01 | 2024-02-01 |
| Steve Davis | 97 | 1981-05-01 | 1990-10-01 |
| Stephen Hendry | 91 | 1990-11-01 | 2006-01-01 |
| John Higgins | 61 | 1998-06-01 | 2025-05-01 |
| Ray Reardon | 44 | 1976-02-01 | 1983-05-01 |
| Judd Trump | 44 | 2012-01-01 | 2025-04-01 |
| Mark Williams | 34 | 2000-06-01 | 2021-11-01 |
| John Spencer | 24 | 1975-02-01 | 1978-04-01 |
| Neil Robertson | 18 | 2011-11-01 | 2023-01-01 |
| Cliff Thorburn | 10 | 1985-03-01 | 1986-10-01 |

Obviously, it was much easier to remain highest rated for a long time when there were far fewer events per year, but I do believe this list is a pretty representative hall of fame of snooker players.

One final comment on the historical view of Elo ratings: you might have noticed that the values have generally increased over time. There are a number of factors at play – mostly the fact that there are a lot more players and matches these days, which gives the top players more opportunities to collect points from weaker opponents. Remember that we chose \(K\) such that the Elo system would have the best predictive power. Since the vast majority of the matches in the dataset were played in the last two decades or so, the ratings are tuned with a strong recency bias. Interpret historical Elo ratings with caution and remember that Elo is most descriptive within an active community of players.

10 Million Tournaments Later…

I thought it would be a fun application to use those Elo ratings we calculated to predict who will win the current World Championship. For this, I’ve run a bunch of simulated tournaments. The idea is quite simple: for each of the first round pairings in the draw, I compare the current Elo ratings of those two players, convert them into a win probability as described above, and toss a virtual coin in order to determine who proceeds to the second round. We apply the same principle to that and all the following rounds, until the final coin for the final is tossed and the winner of that simulation run is determined. I’ve run a total of 10 million simulated tournaments and counted how often each player won a simulation. Here are the results:
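To make the procedure concrete, here’s a minimal sketch of such a simulation. It assumes the draw is given as a list of players in bracket order (so neighbours meet in the first round) whose length is a power of two; the function names, the four placeholder players and their ratings are made up for illustration and are not the real draw:

```python
import random
from collections import Counter


def win_probability(r_a, r_b):
    """Predicted probability that A beats B."""
    return 1 / (1 + 10 ** (-(r_a - r_b) / 400))


def simulate_tournament(draw, ratings, rng):
    """Play out one knockout tournament and return the winner."""
    players = list(draw)
    while len(players) > 1:
        next_round = []
        # Pair up neighbours: 1st vs 2nd, 3rd vs 4th, and so on.
        for a, b in zip(players[::2], players[1::2]):
            p_a = win_probability(ratings[a], ratings[b])
            # The "virtual coin": A advances with probability p_a.
            next_round.append(a if rng.random() < p_a else b)
        players = next_round
    return players[0]


def title_chances(draw, ratings, runs=100_000, seed=None):
    """Share of simulated tournaments won by each player."""
    rng = random.Random(seed)
    wins = Counter(simulate_tournament(draw, ratings, rng) for _ in range(runs))
    return {player: wins[player] / runs for player in draw}


# Hypothetical four-player bracket with invented ratings.
toy_draw = ["A", "B", "C", "D"]
toy_ratings = {"A": 700, "B": 500, "C": 600, "D": 400}
for player, chance in title_chances(toy_draw, toy_ratings, runs=10_000).items():
    print(f"{player}: {chance:.1%}")
```

The results fluctuate a little from run to run; the more tournaments you simulate, the smaller that noise becomes, which is the reason for running millions of them.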

| Player | Elo | Simulation probability |
|---|---|---|
| John Higgins | 739.1 | 13.14% |
| Judd Trump | 709.3 | 12.63% |
| Mark Selby | 693.8 | 12.59% |
| Zhao Xintong | 687.7 | 10.58% |
| Kyren Wilson | 695.9 | 9.56% |
| Ali Carter | 605.3 | 5.20% |
| Neil Robertson | 624.2 | 4.74% |
| Barry Hawkins | 641.2 | 4.45% |
| Shaun Murphy | 607.0 | 3.52% |
| Joe O’Connor | 646.8 | 3.51% |
| Xiao Guodong | 601.5 | 2.27% |
| Hossein Vafaei | 600.4 | 2.26% |
| Wu Yize | 579.1 | 2.05% |
| David Gilbert | 544.4 | 1.53% |
| Zak Surety | 552.3 | 1.45% |
| Mark Allen | 551.2 | 1.39% |
| Ding Junhui | 549.3 | 1.36% |
| Pang Junxu | 528.0 | 1.34% |
| Ryan Day | 535.8 | 1.11% |
| Luca Brecel | 525.0 | 0.90% |
| Zhou Yuelong | 539.7 | 0.82% |
| Matthew Selt | 545.6 | 0.79% |
| Lei Peifan | 534.1 | 0.64% |
| Fan Zhengyi | 508.5 | 0.60% |
| Ben Woollaston | 490.1 | 0.42% |
| Chris Wakelin | 476.1 | 0.30% |
| Mark Williams | 470.7 | 0.23% |
| Si Jiahui | 445.5 | 0.21% |
| Daniel Wells | 458.3 | 0.19% |
| Zhang Anda | 427.4 | 0.17% |
| Jak Jones | 379.3 | 0.03% |
| Ronnie O’Sullivan | 345.2 | 0.03% |

Unsurprisingly, the order correlates strongly with the Elo ranking we’ve seen above, but it’s not quite the same. E.g., defending champion Kyren Wilson is the third highest rated player according to Elo, but only the fifth favourite to win the tournament again. How come? The answer is that there are easier and harder paths to the final. Kyren Wilson faces the relatively highly rated Lei Peifan in the first round, whilst the two players who overtook him, Mark Selby and Zhao Xintong, face the relatively lowly rated Ben Woollaston and Jak Jones, respectively.
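You can see the effect of the draw by plugging the ratings from the table into the win probability formula for just those three first round matches. This is only a quick sketch of the first step of each path, not the full simulation:

```python
def win_probability(r_a, r_b):
    """Predicted probability that A beats B."""
    return 1 / (1 + 10 ** (-(r_a - r_b) / 400))


# (favourite, rating, opponent, rating), with ratings taken from the table above
first_round = [
    ("Kyren Wilson", 695.9, "Lei Peifan", 534.1),
    ("Mark Selby", 693.8, "Ben Woollaston", 490.1),
    ("Zhao Xintong", 687.7, "Jak Jones", 379.3),
]

chances = {
    favourite: win_probability(r_fav, r_opp)
    for favourite, r_fav, _opponent, r_opp in first_round
}
for favourite, chance in chances.items():
    print(f"{favourite} reaches round two: {chance:.1%}")
```

This gives roughly 71.7% for Wilson, 76.4% for Selby and 85.5% for Zhao, so even though Wilson has the highest rating of the three, he’s the least likely to survive the first round.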

Let’s make some money with this knowledge, shall we? 💸

How does the market compare? Bookies vs model

Disclaimer: This section discusses betting odds for the purpose of statistical comparison and analysis. It is not intended to promote gambling or serve as betting advice. Please gamble responsibly and be aware of your local laws and age restrictions.

I’m personally not the gambling kind and wouldn’t advise you to pick up that addictive habit either. But sports bets are an undeniably interesting data source for predictions, as you really need to put your money where your mouth is. That’s why you often see betting odds discussed ahead of big sporting events: there are few models with comparable predictive accuracy. So I thought it would be an interesting reality check for those simulations to compare their results to what gamblers are willing to bet on the snooker stars.

First, we need to briefly discuss how to convert those probabilities to odds. Let’s say I offer you the following gamble: you pay me €1, then I toss a (fair) coin. Heads: I pay you back €2; tails: I keep your money. You probably intuitively know that the expected payout is €1, so you can take or leave the bet and wouldn’t be better or worse off for it. Had I offered you a potential win of €2.10, you actually should take the bet; for a potential win of only €1.90, you should definitely pass.

The same basic idea applies to the odds quoted4 in sports betting: The broker will quote odds like 5, meaning I could win €5 if I bet €1 (for a potential gain of €4). If I believe the event will occur with a 20% probability, my expected payout is exactly €1 (20% of €5) – I should only take the bet if my belief in that event is higher (if I must take the bet at all). In other words: in order to convert between probabilities and odds, you just take the reciprocal. E.g., the win probability of 13.14% for John Higgins corresponds to odds of \(1/0.1314=7.61\), i.e., I’d expect to make money if someone offered longer odds and might be inclined to take the bet.
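In code, the conversion and the expected payout boil down to a few one-liners. This is a small sketch for decimal odds only, with function names I picked for illustration:

```python
def fair_odds(probability: float) -> float:
    """Decimal odds at which a bet on this event exactly breaks even."""
    return 1 / probability


def implied_probability(decimal_odds: float) -> float:
    """Probability at which the quoted decimal odds would be fair."""
    return 1 / decimal_odds


def expected_payout(probability: float, decimal_odds: float, stake: float = 1.0) -> float:
    """Average amount paid back per bet (the stake is included in the payout)."""
    return probability * decimal_odds * stake


print(round(fair_odds(0.1314), 2))        # 7.61
print(expected_payout(0.5, 2.0))          # 1.0 (the fair coin toss)
print(round(expected_payout(0.2, 5), 2))  # 1.0
```

A bet is only worth considering if the expected payout exceeds the stake, i.e., if the quoted odds are longer than the fair odds for the probability you believe in.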

So, I’ve taken a look at oddschecker.com to see what odds different brokers offer for different players to win the World Championship. These are the odds5 as offered and how they compare to the odds implied by our simulations:

| Player | Simulation odds | Betting odds | Difference |
|---|---|---|---|
| John Higgins | 7.61 | 16.20 | 8.59 |
| Judd Trump | 7.92 | 5.12 | -2.80 |
| Mark Selby | 7.94 | 6.50 | -1.44 |
| Zhao Xintong | 9.45 | 17.00 | 7.55 |
| Kyren Wilson | 10.46 | 10.30 | -0.16 |
| Ali Carter | 19.24 | 137.00 | 117.76 |
| Neil Robertson | 21.08 | 17.00 | -4.08 |
| Barry Hawkins | 22.48 | 29.00 | 6.52 |
| Shaun Murphy | 28.41 | 26.00 | -2.41 |
| Joe O’Connor | 28.48 | 69.00 | 40.52 |
| Xiao Guodong | 43.98 | 88.00 | 44.02 |
| Hossein Vafaei | 44.24 | 225.00 | 180.76 |
| Wu Yize | 48.84 | 64.00 | 15.16 |
| David Gilbert | 65.32 | 167.00 | 101.68 |
| Zak Surety | 69.14 | 265.00 | 195.86 |
| Mark Allen | 72.02 | 25.00 | -47.02 |
| Ding Junhui | 73.62 | 41.00 | -32.62 |
| Pang Junxu | 74.79 | 126.00 | 51.21 |
| Ryan Day | 89.73 | 490.00 | 400.27 |
| Luca Brecel | 111.24 | 54.00 | -57.24 |
| Zhou Yuelong | 121.72 | 314.00 | 192.28 |
| Matthew Selt | 126.09 | 598.00 | 471.91 |
| Lei Peifan | 156.10 | 470.00 | 313.90 |
| Fan Zhengyi | 166.10 | 323.00 | 156.90 |
| Ben Woollaston | 237.23 | 843.00 | 605.77 |
| Chris Wakelin | 337.92 | 235.00 | -102.92 |
| Mark Williams | 436.80 | 64.00 | -372.80 |
| Si Jiahui | 485.60 | 54.00 | -431.60 |
| Daniel Wells | 516.10 | 980.00 | 463.90 |
| Zhang Anda | 587.96 | 83.00 | -504.96 |
| Jak Jones | 3642.99 | 127.00 | -3515.99 |
| Ronnie O’Sullivan | 3703.70 | 8.50 | -3695.20 |

I’d say that, by and large, those match quite well. Clearly, “the market” believes more in some players than our model does, but the ballpark is right, at least for the highest rated players. (Low probabilities correspond to huge odds, which brokers are rarely willing to offer.) The most notable exception is Ronnie O’Sullivan: as discussed before, our model rates him lowest amongst all participants, but he’s still one of the favourites to win according to the bookmakers – even though he hadn’t even confirmed his participation until a day before the opening match.

I wouldn’t bet my money on him, but as I said: I’m not the gambling kind. 🤷

Final frame 🎱

I hope I made up for the absence of concrete examples from the last article. I’m certainly more excited than ever for 17 days of world class snooker. We’ll see on May 5th if my predictions were worth anything.

The next Elo article will finally drill deeper into the often teased nuances of choosing \(K\) correctly. We’ll then use this knowledge to quantify exactly the ratio between luck and skill in games. It’ll get properly scientific! 🧑‍🔬

As always, you can find all the code for the simulations etc. on GitLab.


  1. There are a number of different decisions one needs to make for such a system, e.g., what matches to include, how to score them, the ubiquitous choice of \(K\), etc. If you want to compare my results with a different approach, take a look at snooker-predictions.com. ↩︎

  2. Seriously, it’s not a Douglas Adams reference. The maths just worked out that way. ↩︎

  3. Because we applied Elo’s formula in its simplest form, all updates will be zero-sum, and the overall average Elo rating will stay fixed at 0. ↩︎

  4. There are different ways to quote odds. The one I’m using for this article is called the decimal or European style, which most easily translates to probabilities. The fractional or British style (which is more common in snooker bets for obvious reasons) quotes the potential win as a fraction. E.g., decimal odds of 5.00 would be quoted as 4/1 (or simply 4) in fractional style. ↩︎

  5. Note that I’ve only used the highest odds offered by any broker. If you were to place a bet, you’d always want to go with the provider who offers you the highest payout, so that number is the most relevant. It’s also worth pointing out that when you sum up the probabilities implied by the odds, they will usually exceed 100%. That’s because the odds are slightly shorter than they should be, as the broker wants their cut (also known as vigorish) too. Remember: the house always wins. ↩︎

