The Game Is A Foot
Created 11th January, 2012

xGoals in the SPL: an introduction

4/9/2017

by Dr Adrian Worton

This website has a lengthy history of football analysis. However, we have lagged behind on the most important metric in football analysis, one that has even begun to be included on Match of the Day (which is hardly at the forefront of analysis). This metric is 'expected goals', also known as xGoals or xG.

For a brief description of how this metric works, see the explainer under the heading below. Those of you who are already familiar with expected goals would be better off skipping it, as you have probably read dozens of explanations elsewhere.
What are expected goals?
One of the most-used statistics in football is Shots Taken. You might look at a match report and see that a team had 15 shots against its opponent's 10, and assume that they went on to win the match.

However, not all shots are equal. For example, a tap-in to an unguarded goal from 6 yards is more valuable for a team than a speculative effort from 30 yards. xG is simply a way of measuring the quality of a chance. Specifically, it says how likely each chance is to result in a goal.

So, for the two examples above, the 6-yard chance may be scored 95% of the time, whilst the shot from 30 yards may be scored 2% of the time. The xG values for the two shots are therefore 0.95 and 0.02 respectively. Let's say both shots were by the same side: this means that in total we'd expect them to score 0.97 goals (0.95 + 0.02).

In order to know the xG of a chance, past data is used to give an idea of what we would expect. A simple example is for a penalty. We may look back and find that 75% of penalties are scored, which means that the xG of any penalty would be 0.75. 

For non-penalties it is more complicated, as we need to find ways of quantifying how good each chance is. Which metrics are included depends on the model used, and the more sophisticated the model, the more factors it will include. Generally, things such as distance from goal, the number of players in the way of the shot and how the player hit the ball (e.g. header, dominant foot, weaker foot) are the kind of things that may be used.

The main reason that we haven't looked at expected goals yet is that, in order to use them, you need a comprehensive data set to build up a model you can rely on. It's fair to say that the most important factor in calculating xG is the distance from goal. However, football statistics that include positional data can be expensive, and were previously beyond the scope of this site.

However, we have been lucky enough to be given access to some very valuable data by StrataData, who look at various leagues across the world. Crucially, this includes data on the Scottish Premier League (SPL), which isn't covered by the other main providers of detailed football data. This is particularly interesting to me as I live in Scotland. Therefore, we will be using this data to analyse the SPL.

We will be looking at teams and individuals in future articles, but for now we will just go over how our xG model works.

Our model

To add depth to our model, we have data from four leagues in addition to the SPL, all of approximately similar quality.

As mentioned in the xG explainer above, models can be continually refined by taking more and more factors into account, trying to use various metrics to estimate how good a chance is. However, the beauty of StrataData is that each shot is assessed and placed into a category describing how good the chance is, and how likely it is to result in a goal. The categories are various ratings ranging from Poor to Superb.

We then calculate how often chances in each category are converted across our data, in order to come up with an xG value for each category.
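As a rough illustration of this step, the sketch below turns a list of categorised shots into one xG value per category by dividing goals by shots. The shot records are invented, and the intermediate "Good" category is a placeholder: only "Poor" and "Superb" are named here, and this is not the real StrataData rating scale.

```python
from collections import defaultdict

# Hypothetical shot records: (chance category, whether it was scored).
# All counts are made up, and "Good" is a placeholder category name.
shots = [
    ("Poor", False), ("Poor", False), ("Poor", False), ("Poor", True),
    ("Good", False), ("Good", True),
    ("Superb", True), ("Superb", True), ("Superb", False), ("Superb", True),
]


def xg_by_category(shots):
    """Return the fraction of shots scored in each category."""
    taken = defaultdict(int)
    scored = defaultdict(int)
    for category, was_goal in shots:
        taken[category] += 1
        scored[category] += int(was_goal)
    return {category: scored[category] / taken[category] for category in taken}


category_xg = xg_by_category(shots)
for category, xg in category_xg.items():
    print(f"{category}: {xg:.2f}")
```

With a real data set the same calculation would simply run over many thousands of shots, which is why pooling several leagues helps: each category's conversion rate is estimated from more examples.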

With estimates of how likely each chance is to be scored, we can then calculate a different measure for each match, and one which is potentially more interesting: the likelihood of victory.

Likelihood of victory

Because we know the likelihood of each chance resulting in a goal, we can also work out the likelihood of each team winning a given match, given the chances they had.

To take an incredibly simple example, say Team A & Team B had a match, where there was only one chance, and it fell to Team A. If the xG of that chance is 0.25, then there is a 25% chance of Team A winning, and a 75% chance of the match ending in a draw. 

For a slightly more complicated example, say Team A had two chances, which had xG of 0.05 and 0.4, whilst Team B had one chance with an xG of 0.65. We can work out the probabilities by taking each chance sequentially (the order doesn't matter).

After chance 1 (Team A; xG of 0.05), the score probabilities are:
  • 0-0: 95%
  • 1-0: 5%

After chance 2 (Team A; xG of 0.4), the score probabilities are:
  • 0-0: 95% x 60% = 57% [the probability that the score was 0-0 and this chance was missed]
  • 1-0: (5% x 60%) + (95% x 40%) = 41% [the probability that the score was previously 1-0 and this chance was missed, added to the probability that the score was 0-0 and this chance was scored]
  • 2-0: 5% x 40% = 2% [the probability the score was 1-0 and this chance was scored]

And finally, after chance 3 (Team B; xG of 0.65) the score probabilities are:
  • 0-0: 57% x 35% = 20% [the probability the score was 0-0 and this chance was missed]
  • 1-0: 41% x 35% = 14% [the probability Team A led 1-0 and this chance was missed]
  • 2-0: 2% x 35% = 1% [the probability Team A led 2-0 and this chance was missed]
  • 0-1: 57% x 65% = 37% [the probability that the score was 0-0 and this chance was scored]
  • 1-1: 41% x 65% = 27% [the probability that Team A led 1-0 and this chance was scored]
  • 2-1: 2% x 65% = 1% [the probability that Team A led 2-0 and this chance was scored]

To see each outcome's likelihood of happening, we add up the probabilities of each scenario that matches this outcome:
  • Team A win: 14% + 1% + 1% = 16% [the probabilities of Team A winning 1-0, 2-0 and 2-1]
  • Draw: 20% + 27% = 47% [the probabilities of a 0-0 and a 1-1 draw]
  • Team B win: 37% [the probability of Team B winning 0-1]
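The same chance-by-chance bookkeeping can be sketched in a few lines of Python. This is a minimal illustration rather than the site's actual code: the function names are our own, and it assumes, as the worked example does, that each chance is scored or missed independently of the others.

```python
from collections import defaultdict


def score_distribution(chances):
    """Map each possible score (goals_a, goals_b) to its probability.

    `chances` is a list of (team, xg) pairs, where team is "A" or "B".
    Chances are assumed to be independent of one another.
    """
    dist = {(0, 0): 1.0}
    for team, xg in chances:
        updated = defaultdict(float)
        for (goals_a, goals_b), prob in dist.items():
            # Chance missed: the score stays as it was.
            updated[(goals_a, goals_b)] += prob * (1 - xg)
            # Chance scored: the shooting team's tally goes up by one.
            if team == "A":
                updated[(goals_a + 1, goals_b)] += prob * xg
            else:
                updated[(goals_a, goals_b + 1)] += prob * xg
        dist = dict(updated)
    return dist


def outcome_probabilities(chances):
    """Return (P(Team A wins), P(draw), P(Team B wins))."""
    win_a = draw = win_b = 0.0
    for (goals_a, goals_b), prob in score_distribution(chances).items():
        if goals_a > goals_b:
            win_a += prob
        elif goals_a == goals_b:
            draw += prob
        else:
            win_b += prob
    return win_a, draw, win_b


# The three chances from the worked example above.
chances = [("A", 0.05), ("A", 0.4), ("B", 0.65)]
win_a, draw, win_b = outcome_probabilities(chances)
print(f"Team A win: {win_a:.0%}, draw: {draw:.0%}, Team B win: {win_b:.0%}")
```

Run on the example, this reproduces the 16% / 47% / 37% split. Because each chance only ever updates the running table of scores, the same loop handles a real match with a couple of dozen chances without any changes.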

So from this, we can see that Team B are more likely to win than Team A, but that a draw is the most likely outcome of all. Of course, a real match will have a couple of dozen chances, but the principle remains the same.

After a match, fans and pundits often argue that the result "should" have been something different to the final score. This measure allows us to put a number on that claim, and when taken over several games it will be very useful in seeing whether a team deserves to have the points tally it has, or whether it has been lucky/unlucky.
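One way to make the comparison over several games concrete is to turn each match's win and draw probabilities into an "expected points" figure, using the usual three points for a win and one for a draw, and set the total against the points actually won. Treat this as a sketch under our own assumptions: the term "expected points" and the function below are ours, the probabilities are Team A's from the two worked examples above, and the actual results are invented.

```python
WIN_POINTS, DRAW_POINTS = 3, 1  # standard league scoring


def expected_points(p_win, p_draw):
    """Average points a team would earn from a match with these probabilities."""
    return WIN_POINTS * p_win + DRAW_POINTS * p_draw


# Team A's probabilities from the two worked examples (rounded as in the text).
# The actual_points values are hypothetical results, made up for illustration.
matches = [
    {"p_win": 0.16, "p_draw": 0.47, "actual_points": 0},  # lost
    {"p_win": 0.25, "p_draw": 0.75, "actual_points": 1},  # drew
]

total_expected = sum(expected_points(m["p_win"], m["p_draw"]) for m in matches)
total_actual = sum(m["actual_points"] for m in matches)

print(f"Expected points: {total_expected:.2f}, actual points: {total_actual}")
print(f"Difference (actual - expected): {total_actual - total_expected:+.2f}")
```

A clearly negative difference over many matches would suggest a side has been unlucky relative to the chances it created and conceded, and a clearly positive one that it has been fortunate.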

Summary

Expected goals have been covered in huge detail elsewhere, and our likelihood of victory measure is nothing new, either. However, by combining this with our exciting new data, we should be able to use it to provide new insights, particularly into the realm of Scottish football.

Next time we are going to look at the twelve SPL sides and see how they have been performing thus far this season.

This article was written with the aid of StrataData, which is property of Stratagem Technologies. StrataData powers the StrataBet Sports Trading Platform, in addition to StrataBet Premium Recommendations.

    Author: Adrian

    Doctor of Mathematics and former football analyst.
