by Adrian Worton
Introduction
Over the course of the summer, we introduced mathematical modelling to our analysis of sports. In particular, our World Cup simulator was based around bookies' odds, and running this simulator a thousand times gave us very strong ideas of the "true" probabilities of various outcomes.
In a marked change for this site, we will turn this approach away from sports and look at the upcoming General Election. This is the ideal place for this type of modelling, because the overall outcome of the election is decided by the total number of seats each party accumulates. We treat the outcome in one seat as not being influenced by the outcome in another, which means our modelling can assume each seat is an independent event.
In this article, we will explain the method used in the creation of the simulator, an interactive version of which is available on this page.
Modelling each seat
Luckily for us, odds for each constituency in Britain are available online (see Sources). We will use the constituency of Blyth Valley as our example (one of three I could claim as my home constituency). The odds for Blyth Valley are as follows:
- Labour: 1/50
- UKIP: 16/1
- Liberal Democrats: 50/1
- Conservatives: 100/1
From this part onwards, our conversion of odds into probabilities follows that outlined in this article on our Premiership simulator. Therefore, if you are familiar with that, skip ahead to the next section.
Let us convert our odds from fractional into decimal form, so Labour's 1/50 will become 0.02. We want values which are larger for parties with more chance of winning, so for each party we calculate 1/[odds+1]. An explanation of this is available in Appendix 2 of the aforementioned article.
We now have values of 0.980 for Labour, 0.059 for UKIP and 0.020 for the Lib Dems. These sum to 1.059, so they are not quite ready to be taken as probabilities, which must sum to 1. We therefore divide each of our three values by 1.059 (which equates to multiplying by 0.944..., a reduction of about 5.6%). We get resulting values of:
- Labour: 0.926 (92.6%)
- UKIP: 0.056 (5.6%)
- Liberal Democrats: 0.019 (1.9%)
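The conversion above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the code behind the simulator: the function name `odds_to_probabilities` is made up for this example, and the Conservatives (100/1) are left out so that the output matches the three worked figures above.

```python
from fractions import Fraction


def odds_to_probabilities(fractional_odds):
    """Turn {party: "a/b"} fractional odds into probabilities that sum to 1."""
    # Raw value per party: 1 / (odds + 1), with the odds in decimal form.
    raw = {
        party: 1 / (float(Fraction(odds)) + 1)
        for party, odds in fractional_odds.items()
    }
    # Divide by the total so that the values sum to exactly 1.
    total = sum(raw.values())
    return {party: value / total for party, value in raw.items()}


# Blyth Valley odds from the article (Conservatives omitted, as in the worked example).
blyth_valley = {"Labour": "1/50", "UKIP": "16/1", "Liberal Democrats": "50/1"}
probabilities = odds_to_probabilities(blyth_valley)

for party, p in probabilities.items():
    print(f"{party}: {p:.3f}")
```

Dividing by the total, rather than subtracting a fixed percentage, keeps the ratio between the parties' values unchanged.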
This method of turning odds into probabilities is applied to every constituency, and each simulation brings a new set of results based on these probabilities.
Northern Ireland
As previously mentioned, we only have odds for the seats in Britain, not for those in Northern Ireland. This does not pose much of a problem for our model, as it is very unlikely that any constituency in Northern Ireland will return an MP from any of the national parties. Therefore we will treat all 18 constituencies as returning a "NIR" MP in our results.
However, it is important to model the non-national parties Plaid Cymru and the SNP, as they will be in competition with the national parties for their seats, so their success will impact the overall race to Number 10.
The overall model
The full model can be viewed here. Our results are simply taken from summing the number of seats each party wins.
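As a rough sketch of how one simulated election might be put together, the Python below draws a winner for each seat from its probabilities, adds the 18 fixed "NIR" seats, and sums the seats per party. It is not the simulator's actual code: the names `SEAT_PROBABILITIES` and `simulate_election` are invented for illustration, "Example Seat" is a made-up constituency, and the 1,000 runs and the seed are arbitrary choices.

```python
import random
from collections import Counter

# Illustrative input only. The Blyth Valley figures come from the article;
# "Example Seat" is a hypothetical constituency added to show the summing.
SEAT_PROBABILITIES = {
    "Blyth Valley": {"Labour": 0.926, "UKIP": 0.056, "Liberal Democrats": 0.019},
    "Example Seat": {"Conservatives": 0.6, "Labour": 0.4},
}
NORTHERN_IRELAND_SEATS = 18  # all counted as "NIR", as described above


def simulate_election(seat_probabilities, rng):
    """Run one simulated election and return the seats won by each party."""
    totals = Counter({"NIR": NORTHERN_IRELAND_SEATS})
    for probs in seat_probabilities.values():
        parties = list(probs)
        weights = list(probs.values())
        # Each seat is drawn independently of every other seat.
        winner = rng.choices(parties, weights=weights, k=1)[0]
        totals[winner] += 1
    return totals


rng = random.Random(0)  # fixed seed so that reruns give the same results
runs = [simulate_election(SEAT_PROBABILITIES, rng) for _ in range(1000)]

# Average number of seats per party across all runs.
all_parties = sorted({party for run in runs for party in run})
average_seats = {
    party: sum(run[party] for run in runs) / len(runs) for party in all_parties
}
print(average_seats)
```

Keeping every run, rather than only the averages, means the same list can also be used to estimate how often a party reaches a given number of seats.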
The huge strength of this approach to modelling the election is that it acknowledges all runners in a constituency. There are some approaches which just take the most likely candidate for each seat according to their calculation method and predict that they will win, which is dangerous territory.
Suppose there are two constituencies, and each is 60% likely to have a Conservative win, and 40% likely to have a Labour win. These methods will predict 2 Conservative wins. However, the chances of this happening are only 36% (60% x 60%). There would be a 48% chance of one MP from each party, and a 16% chance of two Labour MPs. Our method takes such probabilities into account.
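The figures in this two-seat example can be checked by listing every combination of winners. The snippet below is a small illustrative check in Python; the variable names are our own and it assumes, as the model does, that the two seats are independent.

```python
from collections import defaultdict
from itertools import product

# Win probabilities in each of the two hypothetical constituencies.
seat = {"Conservative": 0.6, "Labour": 0.4}

# Probability of each possible number of Conservative seats.
distribution = defaultdict(float)
for first, second in product(seat, repeat=2):
    conservative_seats = [first, second].count("Conservative")
    # Independence lets us multiply the two seats' probabilities.
    distribution[conservative_seats] += seat[first] * seat[second]

for n in (2, 1, 0):
    print(f"{n} Conservative seat(s): {distribution[n]:.0%}")
```

The single most likely result is one MP from each party, which a method that simply picks the favourite in every seat can never predict.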
A downside to this method is that it is entirely reliant on the odds provided by bookmakers. However, bookmakers are arguably the most reliable source when it comes to votes: their motive is to make money, and the best way for them to do so is to offer odds which reflect real-life probabilities, since otherwise punters could exploit them. Indeed, ahead of the Scottish referendum, whilst newspapers were carrying stories of the Yes camp taking the lead (because drama increases sales, and their motive is also to make money), Yes remained relatively stable as second-favourite in the odds, as shown here. Furthermore, as we found out with our World Cup simulator, using bookies' odds within our model opens up the possibility of finding areas where odds are under- or over-priced, which makes money-making a possibility.
Next time we will look at the results from our model in order to make predictions for each party.
Sources
Ladbrokes - provide odds on every constituency in Britain.
Electoral Calculus - used to assign regions to each constituency.
General Election articles
Next: TGIAF General Election predictions