by Adrian Worton
Following election defeat for Labour, the Liberal Democrats and, most importantly, those of us who tried to predict the election, we saw that our model was no different: our final predictions were way off.
However, we were highly encouraged by the clear need for election predictions to include ranges, giving an idea of the true area within which we might expect results to fall.
As this was a first attempt at modelling a General Election, analysing the results afterwards was always going to be a key part of our work, as we look to make the model as functional as possible for the future. We therefore need to look back at the election results and see how our model could best have been configured.
Why Phi?
The key parameter in our model was ϕ (phi), which increased the chances of the favourite winning each seat as ϕ itself increased. For a fuller explanation of its role, see the article where we decided to use the value of 1.8 for it. Looking back at the election results, we saw that a lower value of ϕ was better for achieving the right proportion of seats won by favourites, but that a higher value led to the most accurate overall seat predictions.
Therefore, our job is to find the value of ϕ which best trades off these two factors. To do this we need to look at the spread of predictions that would have been made by different values and see which provides the best basis.
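To make the mechanics concrete, here is a minimal sketch of one way a parameter like ϕ can favour the favourite. This is an illustrative assumption rather than our model's exact formula (the article linked above has the real details): each party's win probability in a seat is raised to the power ϕ and renormalised, so ϕ = 1 leaves the probabilities untouched and larger values push more weight onto the favourite.

```python
# Illustrative sketch only: raise each party's win probability to the power
# phi and renormalise. phi = 1 leaves probabilities unchanged; phi > 1 pushes
# weight towards the favourite. This is an assumed form, not necessarily the
# exact formula used in our model.

def sharpen(probabilities, phi):
    """Raise each win probability to the power phi and renormalise."""
    powered = [p ** phi for p in probabilities]
    total = sum(powered)
    return [p / total for p in powered]

# Example seat: favourite on 50%, challengers on 30% and 20%.
seat = [0.5, 0.3, 0.2]
print(sharpen(seat, 1.0))  # unchanged: 0.50, 0.30, 0.20
print(sharpen(seat, 1.8))  # favourite boosted to roughly 0.63
```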
Which value is best?
We will consider values between 1 and 2, and will look at the spread of results for the Conservatives, Labour, the Liberal Democrats, the SNP and UKIP. The graphs below show the probability distributions of seat totals for each party for varying values of ϕ.
The dashed line represents the seat total reached if ϕ continues to infinity (in other words, the number of seats where that party is the favourite), and the solid line is the actual number of seats they achieved. Meanwhile, the stars represent predictions made by other sources.
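For readers curious how distributions like these can be produced, the sketch below is one plausible reconstruction rather than our exact procedure: simulate the election many times, sampling a winner in every seat from ϕ-sharpened probabilities, and record each party's seat total. The seat data here is invented purely for illustration, not taken from our constituency figures.

```python
import random
from collections import Counter

def simulate_seat_totals(seats, phi, runs=10000):
    """Monte Carlo sketch: sample a winner in every seat and tally totals.

    `seats` is a list of {party: win_probability} dicts, one per constituency.
    Illustrative reconstruction only, not our model's exact procedure.
    """
    results = []
    for _ in range(runs):
        totals = Counter()
        for probs in seats:
            parties = list(probs)
            weights = [probs[p] ** phi for p in parties]  # phi sharpening as above
            winner = random.choices(parties, weights=weights)[0]
            totals[winner] += 1
        results.append(totals)
    return results

# Toy example with three made-up seats rather than all 650 constituencies.
seats = [
    {"Con": 0.55, "Lab": 0.35, "LD": 0.10},
    {"Lab": 0.60, "SNP": 0.40},
    {"Con": 0.45, "UKIP": 0.40, "LD": 0.15},
]
runs = simulate_seat_totals(seats, phi=1.7, runs=5000)
con_totals = Counter(r["Con"] for r in runs)
print(sorted(con_totals.items()))  # distribution of Conservative seat totals
```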
Our selection of ϕ will once again have to be subjective. We are looking for a value which strikes the right balance: providing enough flexibility to allow a wide range of predictions, whilst bringing the peak of our predictions to roughly the same level as other predictors, and indeed the actual number of seats.
With the exception of Labour, we see that lower values of ϕ put us on the wrong side of the predictions. Indeed, with UKIP a low value of ϕ leads to horrendously inaccurate predictions. We can therefore rule out anything below 1.6.
Using 2.0 appears to make the ranges too narrow, in particular for the Liberal Democrat distribution. That leaves us with 1.6 and 1.8, and a sensible compromise between our two aims is to use a value of 1.7 for ϕ.
Conclusion
This is only marginally different to the 1.8 we chose in early April (and there is a valid case for still using 1.8), which is a good vindication of our earlier work.
One thing that is incredibly apparent with all values of ϕ, and indeed with all predictions made elsewhere, is that none would have provided a range which included the final seat tally for the Conservatives, Labour or the Liberal Democrats. This emphasises how unlikely the actual result was, based on our understanding of the electorate before the vote.
We will therefore next be looking at how the model could further have been configured to get closer to estimating these unlikely scenarios.
General Election articles
Previous: The Aftermath(ematics)
Next: Bring on the Swing