Is xG changing the language of football?
I’m not a proper football guy – on the sideline with tracksuit on and notepad out. Anyone who knows me will attest to that. I’m not an analytics die-hard either, armed with a spreadsheet and mathematical aptitude. I am (hopefully you’ll agree) a words guy.
So, I want to see if the words we use as fans, and hear from commentators, have anything to tell us about where the gaps in the stats are, and where they do better than we do colloquially. This comes partly from my love of words, and also from an interest in the oft-reported conflict between traditional football men and new analytics-driven decision making. I want to see if words, commentators and the armchair fan have a part to play.
Because a full analysis of all football stats would take all day, and more importantly is way beyond my abilities, let’s take the sexiest one – xG. I won’t explain xG in detail here – if you need a primer check this out. One thing I do want to note is that xG can be used in a few different contexts.
- You can use it to analyse individual matches, which is the way most of us see it used on Match of the Day (…and Twitter). In this context, xG shows how many goals each team would have been expected to score in the game based on the chances they created.
- It can also be used to analyse a team’s performance over the course of a season. Should they have more points than they do based on their xG difference? Is a star goalkeeper helping them concede fewer goals than the chances they give up would suggest?
- Finally, xG can be used to analyse individual player performance. Does a striker score more or fewer goals than their chances suggest they should? This is called xG difference - the difference between their xG and their actual goals.
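
If it helps to see that last one as sums, here is a minimal sketch in Python. The sign convention (goals minus xG, so a positive number means the player is beating their chances) and the striker's numbers are my own assumptions for illustration, not real data.

```python
def xg_difference(goals: int, xg: float) -> float:
    """Goals scored minus expected goals (positive = out-performing the chances)."""
    return goals - xg


# Hypothetical striker: 12 goals from chances worth 9.5 xG in total
striker_diff = xg_difference(goals=12, xg=9.5)
print(f"xG difference: {striker_diff:+.1f}")
```

Some sites flip the sign, so it's worth checking which way round a given table has it before drawing conclusions.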
‘High, wide and not at all handsome.’
How badly have they missed it though?
We’ve all been on the terraces when a shot goes out for a throw-in (no? Just me, watching Brian Deane up front during his second stint for Leeds?). Morale and momentum can be important in the swing of a football match. Almost scoring a difficult chance can invigorate a team, while watching a striker completely butcher a very presentable opportunity can have an equally demoralising effect.
xG can only tell you whether or not a player missed a shot, not by how much. It potentially misses out on how a player’s misses can make the players around him, and even an entire ground, feel. This can have a very real impact on the final result. How can a team even hope to shoot for the stars if their strikers are shooting for throw-ins?
‘Well, he doesn’t get many chances, but when he does he really takes advantage.’
One of the main problems with using xG to analyse individual players is that it can be thrown out reasonably easily by a small sample size. For example, I think this is an interesting list of the top players who are out-performing their xG, but it does throw up some issues.
The first is that players (looking at you, Kevin De Bruyne) can rocket up the rankings with one long-range goal. The second is that other players make the list by scoring 5 goals efficiently, with no account taken of the poor goalkeeping, deflections or good old-fashioned luck that could influence such a small sample. This critique goes for traditional stats too, of course – but I think it bears repeating for new-age stats as well.
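
To put a rough number on the luck part, here is a small sketch. It assumes every shot is independent and worth exactly the same xG, which real shots are not, and the player and figures are made up for illustration.

```python
from math import comb


def prob_at_least(goals: int, shots: int, xg_per_shot: float) -> float:
    """Chance of scoring at least `goals` from `shots` identical, independent shots."""
    p_fewer = sum(
        comb(shots, k) * xg_per_shot**k * (1 - xg_per_shot) ** (shots - k)
        for k in range(goals)
    )
    return 1 - p_fewer


# Hypothetical average finisher: 30 shots worth 0.1 xG each (3.0 xG in total)
lucky_five = prob_at_least(goals=5, shots=30, xg_per_shot=0.1)
print(f"Chance of 5+ goals from 3.0 xG: {lucky_five:.0%}")

# One long-range goal from a hypothetical 0.03 xG shot adds this much xG difference
one_screamer = 1 - 0.03
print(f"xG difference from one screamer: {one_screamer:+.2f}")
```

Under those assumptions, a perfectly ordinary finisher lands on 5 or more goals a bit less than one time in five, so a short list of "efficient" scorers will always contain a few who were simply fortunate.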
‘She didn’t let her head go down, and now she’s grabbed the winner’
This is the opposite problem to the one I’ve just discussed. What to make of the striker who misses three big chances, but then finally does grab the winner that means his team go home with all three points? Usually, they’ll earn praise from the commentators, and even occasionally from the fans. Their xG difference will certainly have taken a hit though.
Now consider a player, in a similar position, who lets his head go down after missing a couple of chances and doesn’t get a chance for the rest of the game. His team draw. Next week, he pops up and takes a big chance. His xG difference might be better than the previous player’s, but his goal was arguably less important and his mental strength when the going gets tough is hidden by the stats.
‘He hasn’t scored for seven years, but we all know he’s a big-game player, Clive.’
Again, this is an extreme example. But a major thing that xG doesn’t account for is whether or not a goal is physically more difficult to score in a bigger game. Greater pressure, both from the importance of the situation and from the likely scarcity of chances making each one matter more, strongly suggests that it might be.
Manchester United’s £75m man Lukaku has been criticised as a ‘flat-track bully’ this season. A quick look through his stats suggests there might be something to this - he’s only scored 3 goals against ‘big’ teams, having netted against Real Madrid, Sevilla and Chelsea. This is important for teams chasing trophies. We all remember how Drogba terrorised Arsenal and how the latter’s inability to beat the big teams was often cited as a factor in their failure to win the league.
This difference between the ‘same’ chance in a different environment is something xG struggles to quantify, although there are other stats that can help.
In time, the data from xG could improve to a point where you could place a difficulty rating on a chance based on the kind of match it came in. Right now though, that doesn’t exist (as far as I’m aware – if you’re more qualified than me and are reading this then please get in touch!).
‘They've put that one on a plate for the striker there...’
In terms of the quality of a player, and their value to the team, there’s a difference between a player who creates and misses their own chance, and a player (*cough* Benteke) who skies a chance laid on for him by another player (*cough* Zaha).
Using a stat like xG difference alone doesn’t account for this. I understand that the type of chance and how it was created is included in the difficulty, but that’s not quite the issue here. The problem is in weighing a player who missed their own chance, which wouldn’t have existed without them, against a player who wasted the efforts of a teammate.
This can be fixed by using stats in conjunction with each other, including chances created, successful dribbles and even assists. My point here isn’t about stats being useless, just that xG shouldn’t be used as a standalone to evaluate players by clubs, fans or the media.
xG as the destroyer of narratives and empty words
I really like xG stats next to the actual goals for a game though. I think this is often a really useful narrative-buster. As an example, I pulled out the xG for a game (or two) with a narrative I don’t think holds up to the scrutiny of xG. The games are the two legs of Tottenham’s Champions League last-16 tie with Juventus, which has been seen as part of the narrative that Spurs are bottlers. As Chiellini put it, “It’s the history of Tottenham. They always created many chances to score so much, but at the end they miss always something to arrive at the end.”
The score of the first leg was Juventus 2 - 2 Spurs. The xG from the first leg read: Juventus - 1 (+2 pens) and Tottenham - 1.7. This largely tells the story of that game. Spurs ahead on the balance of play, with a couple of big moments (one penalty and a good chance for Higuain) which Juve took advantage of – ending in a draw. Spurs did well to score 2 goals, beating their xG of 1.7, while Juventus took two of their big chances, and squandered a third when Higuain hit the bar from the spot.
The second game is even more interesting. Let’s take a look at the traditional stats from that game.
Score: Spurs 1-2 Juventus
Possession: 54 – 46
Shots: 23 – 8
On Target: 6 – 3
Ok, the bottling narrative looks pretty good to me here. Spurs were on top, created chances, and then failed to take them, putting only about 26% of their shots on target (6 of 23) against 38% for Juventus (3 of 8).
Now let’s check out the xG:
Spurs 1.8 - 1.5 Juventus
The xG still shows Spurs marginally on top. But it’s much closer, and explains why the difference in shots on target between the two sides was much smaller than overall shots. Juventus’ shots came from better positions, where they were more likely to score, or at the very least hit the target.
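
You can see the same thing by dividing the numbers quoted above by the shot counts. This is just a back-of-the-envelope sketch: "xG per shot" here is my own simple average of the match totals, not an official stat from any provider.

```python
# Second-leg figures quoted above: (shots, shots on target, xG)
second_leg = {
    "Spurs": (23, 6, 1.8),
    "Juventus": (8, 3, 1.5),
}


def shot_profile(shots: int, on_target: int, xg: float) -> dict:
    """Share of shots on target and average chance quality per shot."""
    return {"on_target_rate": on_target / shots, "xg_per_shot": xg / shots}


profiles = {team: shot_profile(*stats) for team, stats in second_leg.items()}
for team, p in profiles.items():
    print(f"{team}: {p['on_target_rate']:.0%} on target, {p['xg_per_shot']:.2f} xG per shot")
```

On these figures the average Juventus shot was worth more than twice the average Spurs shot, which is the gap the raw shot count hides.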
I think the xG for this game paints a much fairer picture than either the traditional stats or the narrative. Yes, Spurs created more chances, but Juventus were solid at the back, with Chiellini and Buffon in particular demonstrating why they’ve been so successful for so long. They allowed Spurs to have shots only from difficult positions, and made life awkward for the players taking those shots. Then Allegri showed why he’s one of the most coveted managers in world football. He changed the game for 10 minutes, in which time Juventus created two chances worth almost the same in terms of quality as all the chances Spurs created throughout the 90.
The xG has Spurs edging it 1.8 to 1.5. So on the chances created they should, narrowly, have won. But in terms of putting the ball in the back of the net, Spurs didn’t necessarily bottle it; they just came up against a well-organised unit. I think xG provides a better way to look at this game than either the traditional stats or Chiellini’s romantic notions of history.
The future of xG
Clearly xG is here to stay, and I like it. I’m also glad to say, though, that there are still some things which all football fans (and even commentators…) know instinctively which xG can’t quite quantify. I’m sure this won’t last…