If you are a regular reader of TGIAF, you will be aware that our most distinctive output is our simulators. Personally, I think the ones created for the General Elections have been the most powerful, but it all started four years ago with our 2014 World Cup simulator.
Well, it's time to pick your random numbers, because our World Cup simulator is back for Russia!
Individual matches
We will start by considering a single match, then broaden our scope until we've looked at the whole tournament.
To demonstrate how a match works, we'll use the example of Sweden vs. South Korea, shown in the gallery below. Numbers in square brackets tell you which image in the gallery is being described.
Firstly, we start with the average odds for each outcome (in decimal) taken from several bookmakers [1]. Each outcome's odds are transformed by the equation 1/(odds+1) to give its 'inverted odds' [2]. Each inverted figure is then divided by the total of all three inverted odds, which in our example is 1.066. This gives each outcome's probability.
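For anyone who wants to try the arithmetic, here is a minimal Python sketch of that step. The function name and the example odds are made up for illustration and are not the figures from the gallery. It also assumes the odds are fractional odds written as a decimal number (so 6/4 becomes 1.5), which is what the 1/(odds+1) formula implies; if you are working from European-style decimal odds, the inversion would be 1/odds instead.

```python
def odds_to_probabilities(odds):
    """Turn the odds for (Team A win, draw, Team B win) into probabilities.

    Assumes fractional odds written as decimal numbers, e.g. 6/4 -> 1.5.
    """
    # Step 1: invert each outcome's odds with 1/(odds+1)
    inverted = [1 / (o + 1) for o in odds]
    # Step 2: divide by the total so the three probabilities sum to 1
    total = sum(inverted)
    return [inv / total for inv in inverted]


# Hypothetical odds, for illustration only
example_odds = [1.4, 2.2, 2.1]
probabilities = odds_to_probabilities(example_odds)
print(probabilities)
```

The total of the inverted odds comes out above 1 because bookmakers build in a margin; dividing by the total strips that margin out.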
The probabilities are then stacked to give two boundaries: the first is Team A's probability of winning, and the second is that probability plus the probability of a draw. A random number (between 0 and 1) is then generated. If it's below the first boundary then Team A have won. If it's between the first and second boundaries it's a draw. If it's above the second boundary then Team B have won.
So, for example, if the random number for Sweden - South Korea was 0.578, then this is between the two boundaries, so the match is a draw. If it was 0.185 then it would be a Swedish win.
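In code, the decision looks something like the sketch below. The probabilities of 0.40 and 0.30 are placeholders chosen only so that the two random numbers above land where the text says they do; they are not the real Sweden - South Korea figures, and the function names are invented for this example.

```python
import random


def result_from_number(r, p_a, p_draw):
    """Map a random number r in [0, 1) to 'A', 'draw' or 'B'."""
    first_boundary = p_a             # Team A's win probability
    second_boundary = p_a + p_draw   # ...plus the draw probability
    if r < first_boundary:
        return "A"
    if r < second_boundary:
        return "draw"
    return "B"


def simulate_match(p_a, p_draw, rng=random):
    """Draw a fresh random number and return the match result."""
    return result_from_number(rng.random(), p_a, p_draw)


# Placeholder probabilities, not the real ones
print(result_from_number(0.578, 0.40, 0.30))
print(result_from_number(0.185, 0.40, 0.30))
```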
Groups
Using this method, it is not hard to scale up the process to cover a full group. By adding up each team's points accrued, we can create a table. Of course, some teams may be tied on points. Because the simulator covers only results, not scorelines, we put tied teams in a random order.
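Here is a rough sketch of how a group could be simulated along those lines. It assumes the usual three points for a win and one for a draw, with every team playing each of the others once. The probabilities are identical placeholders for every fixture, and the function and variable names are made up for this example.

```python
import itertools
import random


def simulate_group(teams, probs, rng=random):
    """Simulate every fixture in a group and return (table, points).

    probs maps each fixture (a, b) to (p_a_wins, p_draw).
    """
    points = {team: 0 for team in teams}
    for a, b in itertools.combinations(teams, 2):
        p_a, p_draw = probs[(a, b)]
        r = rng.random()
        if r < p_a:                # below the first boundary: Team A win
            points[a] += 3
        elif r < p_a + p_draw:     # between the boundaries: draw
            points[a] += 1
            points[b] += 1
        else:                      # above the second boundary: Team B win
            points[b] += 3
    # Teams level on points are separated by a random number each
    tiebreak = {team: rng.random() for team in teams}
    table = sorted(teams, key=lambda t: (points[t], tiebreak[t]), reverse=True)
    return table, points


teams = ["Germany", "Mexico", "Sweden", "South Korea"]
# Placeholder probabilities: the same for every fixture
probs = {pair: (0.40, 0.30) for pair in itertools.combinations(teams, 2)}
table, points = simulate_group(teams, probs, random.Random(2018))
print(table, points)
```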
Generating odds
This is all well and good, but when it comes to the knockout stages we don't have bookmakers' odds for the matches, because the fixtures don't exist yet. So we have to create our own odds. We use the method first used for the previous World Cup. We will go into this in a little bit more detail later, so this is a quick summary.
Firstly, we need a measure of strength (which we'll call a coefficient) to compare the two teams. The most effective coefficient seems to be the sides' respective odds for winning the whole tournament. Regression was then used to find an equation to turn these coefficients into odds for the two teams.
Extra time & penalties
Note that our method for creating odds will result in a 'Team A'/'Draw'/'Team B' set of outcomes. So we will need to separate teams who draw.
We make the fairly lazy (but not outrageous) assumption that once a match is level after 90 minutes, the outcome is 50:50 between the two sides. Therefore, we generate (yet) another random number. If it's below 0.5 then Team A wins; if it's above 0.5 then Team B wins.
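A sketch of a knockout match under that assumption is below. The function name and the probabilities you would pass in are hypothetical; the point is simply that a second random number is only drawn when the first one lands in the draw band.

```python
import random


def knockout_winner(team_a, team_b, p_a, p_draw, rng=random):
    """Return the team that goes through, settling draws with a 50:50 split."""
    r = rng.random()
    if r < p_a:
        return team_a               # Team A win in 90 minutes
    if r >= p_a + p_draw:
        return team_b               # Team B win in 90 minutes
    # Level after 90 minutes: a second random number decides it
    return team_a if rng.random() < 0.5 else team_b


# Placeholder probabilities, for illustration only
print(knockout_winner("Team A", "Team B", 0.40, 0.30))
```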
Full tournament
From here, it's just a case of filling in which teams go where once they've won.
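The filling-in can be sketched as a loop that keeps halving the field. This is only an illustration: it assumes the teams are listed in bracket order and that there is a power-of-two number of them, and the `pick_winner` function is a coin-flip stand-in for the real odds-based match, which is not reproduced here. All names are made up.

```python
import random


def play_bracket(teams, pick_winner):
    """Play rounds until one team is left.

    teams must be in bracket order, with a power-of-two length.
    pick_winner(a, b) returns whichever of the two teams goes through.
    """
    remaining = list(teams)
    while len(remaining) > 1:
        # Pair up neighbours: (0, 1), (2, 3), ... and keep the winners
        remaining = [
            pick_winner(remaining[i], remaining[i + 1])
            for i in range(0, len(remaining), 2)
        ]
    return remaining[0]


def coin_flip(a, b):
    """Stand-in for a real simulated match: either side is equally likely."""
    return a if random.random() < 0.5 else b


last_eight = ["Team 1", "Team 2", "Team 3", "Team 4",
              "Team 5", "Team 6", "Team 7", "Team 8"]
print(play_bracket(last_eight, coin_flip))
```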
Summary
The real hard work here is creating odds. I have noticed that since the last World Cup, my article on creating odds is by far the most-read on this website - somehow it comes up on the first page of Google results for "creating football odds". If only I'd plastered it in adverts...
Anyway, buy Gazprom.