Our General Election model is based on odds given by bookies (namely Ladbrokes) for each constituency, which are turned into percentages and used as the basis for random generations.
This is not the first election in which these odds have been available; Ladbrokes premiered their constituency-level odds in the run-up to the 2010 election. Therefore, having obtained some of these past odds, we can look to see what can be learned about our current model.
We can see that the TGIAF predictions for each party are either entirely above or entirely below the other predictions. This is not necessarily a major concern, as most other methods simply look at the favourite for a seat and say they will win it, whereas we award each party its probability of winning. So, for example, if a seat has a 60% chance of being won by Party A, and a 40% chance of being won by Party B, then 0.6 and 0.4 are added to our expected seat totals for Parties A and B respectively. This is different to the others, so a little difference is to be expected.
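As a rough illustration of the difference between the two approaches, the sketch below sums win probabilities per party and, for comparison, awards each seat outright to its favourite. The function names and the three example seats are hypothetical, not taken from our data.

```python
from collections import defaultdict

def expected_seats(constituencies):
    """Add each party's win probability in every seat to its running total."""
    totals = defaultdict(float)
    for seat_probs in constituencies:
        for party, prob in seat_probs.items():
            totals[party] += prob
    return dict(totals)

def favourite_seats(constituencies):
    """Award each seat wholly to its favourite, as most other methods do."""
    totals = defaultdict(int)
    for seat_probs in constituencies:
        favourite = max(seat_probs, key=seat_probs.get)
        totals[favourite] += 1
    return dict(totals)

# Hypothetical example: three seats contested by two parties.
seats = [
    {"A": 0.6, "B": 0.4},
    {"A": 0.7, "B": 0.3},
    {"A": 0.55, "B": 0.45},
]
print(expected_seats(seats))   # Party B picks up a share of every seat
print(favourite_seats(seats))  # Party B wins nothing as it is never favourite
```

In this toy example the second-placed party gets over a seat's worth of expected wins despite being favourite nowhere, which is the kind of gap between the methods described above.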
However, looking at the seat predictions for UKIP, the TGIAF value of 29.2 is vastly greater than those of the others, the second-highest being the Guardian's prediction of 4. It seems that our method is over-rewarding parties that are in second place in constituencies, at the expense of the favourites. This would explain why the Liberal Democrats are also a bit higher than expected, and why the SNP (who are favourites in the majority of Scottish seats) are under-valued, along with Labour and the Conservatives.
Therefore, we need to look at how the predicted probabilities from the 2010 election transferred into actual likelihoods of a side winning a seat, in order to see how our model can be transformed.
Unfortunately, the full list of odds ahead of the 2010 vote isn't available (if you do know of a source, please feel free to get in touch). However, we were lucky enough to find a list of the top 200 target seats for the Conservatives, and their respective odds (see Sources).
This gives only a single set of odds for each seat, whereas our method relies on knowing the odds for every candidate in a constituency in order to work out probabilities. Fortunately, we were able to use our 2015 data to turn these odds into percentages. To see the method behind this, see the Appendix.
We can then group these odds into ten bands - those with a 0-10% chance of winning, those with a 10-20% chance of winning, and so on. Within each band we see what proportion of the Conservative candidates actually won. We can see the results on the graph below:
Now we need to work out how we can correct this within our model.
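For reference, the banding check described above can be sketched as follows. This is a minimal sketch, assuming each candidate is represented by a pair of predicted probability and actual result; the function name and the five example candidates are made up for illustration and are not the 2010 figures.

```python
def calibration_by_band(predictions, n_bands=10):
    """Group (probability, won) pairs into equal-width bands.

    Returns {band_index: (count, proportion_won)} for non-empty bands,
    where band 0 covers 0-10%, band 1 covers 10-20%, and so on.
    """
    counts = [0] * n_bands
    wins = [0] * n_bands
    for prob, won in predictions:
        # A probability of exactly 1.0 is placed in the top band.
        band = min(int(prob * n_bands), n_bands - 1)
        counts[band] += 1
        wins[band] += 1 if won else 0
    return {
        band: (counts[band], wins[band] / counts[band])
        for band in range(n_bands)
        if counts[band]
    }

# Hypothetical candidates: (predicted chance of winning, did they win?)
example = [(0.05, False), (0.15, False), (0.55, True), (0.58, False), (0.95, True)]
for band, (count, rate) in calibration_by_band(example).items():
    print(f"{band * 10}-{band * 10 + 10}%: {count} candidates, {rate:.0%} won")
```

If the predicted percentages were perfectly calibrated, the proportion of winners in each band would sit close to the middle of that band.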
Transforming the data
Our current method simply involves flipping the odds upside-down into their pay-offs using the following formula:
We cannot use the 2010 Conservative data as a cast-iron measure of how the relationship works, as it is too small a sample. But we know that the values should follow some relationship where the favourites become more, for want of a better word, favouritier, and so on. You can use the widget below to experiment with different values yourself, in order to see the result on the graph. A value of ϕ=1 will return the original values, anything lower will start to level the playing field, whilst a negative value will reverse the chances of victory in favour of the outsiders. Values that are too high will simply polarise the results, so that every favourite has a 100% chance of winning. Therefore, we recommend using a range for ϕ between 1.0 and 3.0.
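One transformation that behaves in the way described is to raise each probability in a seat to the power ϕ and then rescale so that the seat's probabilities sum to 1 again. The sketch below assumes that form; it is an illustration consistent with the behaviour of ϕ described above rather than a statement of the exact formula in the model, and the function name `sharpen` is hypothetical.

```python
def sharpen(probs, phi):
    """Raise each probability to the power phi, then renormalise to sum to 1.

    Assumes every probability is strictly positive (a zero would break
    negative values of phi).
    """
    powered = {party: p ** phi for party, p in probs.items()}
    total = sum(powered.values())
    return {party: value / total for party, value in powered.items()}

# Hypothetical two-party seat.
seat = {"A": 0.6, "B": 0.4}
for phi in (-1.0, 0.0, 1.0, 2.0, 3.0):
    adjusted = sharpen(seat, phi)
    print(phi, {party: round(p, 3) for party, p in adjusted.items()})
```

Under this assumed form, ϕ=1 leaves the seat unchanged, ϕ=0 gives every party an equal chance, negative values swap the ordering, and larger values push the favourite towards certainty.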
Sources
May2015 - used for its listings of various seat predictions.
Political Betting - for its list of 200 Conservative seat odds from 2010.
Appendix - turning individual odds into percentages
Our challenge is to turn a single set of odds into the relevant percentage. For example, the odds of the Conservatives winning in Bath were given as evens, or 1/1. If bookies did not shorten their odds to guarantee a profit, then this would be equal to a 50% chance (think of it like a game where you toss a coin, and if it's heads you win £1, if it's tails you lose £1). However, we know they do alter odds, so we need to find a way to turn this into its "true" percentage.
All we do is look through our vast data for the current election, and find the average win likelihood of all candidates with the same original odds. For a runner with odds of 1/1, this percentage happens to be 45.7%. The relationship between odds and the likelihood of winning is shown below.
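The lookup described above can be sketched roughly as follows. The sketch assumes that odds are fractional strings such as "5/4", and that the bookmaker's margin is removed by rescaling each seat's raw percentages so they sum to 100%; that rescaling step, the function names and the two example seats are all assumptions made for illustration.

```python
from collections import defaultdict

def implied_probability(odds):
    """Raw implied probability of fractional odds 'a/b', i.e. b / (a + b)."""
    a, b = (int(part) for part in odds.split("/"))
    return b / (a + b)

def normalised_probabilities(seat_odds):
    """Rescale one seat's raw probabilities so that they sum to 1."""
    raw = {party: implied_probability(o) for party, o in seat_odds.items()}
    total = sum(raw.values())
    return {party: p / total for party, p in raw.items()}

def average_by_odds(seats):
    """Average the normalised probability of every candidate quoted at each price."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for seat_odds in seats:
        probs = normalised_probabilities(seat_odds)
        for party, o in seat_odds.items():
            sums[o] += probs[party]
            counts[o] += 1
    return {o: sums[o] / counts[o] for o in sums}

# Hypothetical seats, not real Ladbrokes prices.
seats = [
    {"A": "1/1", "B": "1/1", "C": "10/1"},
    {"A": "1/1", "B": "4/5", "C": "20/1"},
]
print(implied_probability("1/1"))      # evens with no margin
print(average_by_odds(seats)["1/1"])   # evens after the margin is removed
```

Because the raw percentages in a seat add up to more than 100%, the averaged figure for evens comes out below 50%, in the same direction as the 45.7% quoted above.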
Whilst our fundamental model is incredibly simplistic, if we want to fine-tune it to produce more conventional results, an awful lot of tinkering needs to take place.
If we had the full set of odds for 2010, we would be able to get a much better estimation of how to alter our model. Instead, we have to rely on subjective impressions.
Next time we will be using this altered model to give our latest predictions for each party.