Jan 9, 2010

Soccernomics and the misbegotten quest to turn soccer into a statistical sport

Don't get me wrong, the book Soccernomics by Simon Kuper and Stefan Szymanski is a quick and entertaining read and teaches a few solid lessons. It provides some pretty compelling insight into England's woes in particular and manages to shatter a few myths about the business of soccer. But the book falls short of its ultimate goal, to uncover new, "data-driven" truths that will revolutionize the way the sport is coached, scouted and managed. If you're looking for soccer's version of Bill James' Baseball Abstract, this ain't it. In fact, perhaps more than anything else the book demonstrates the perils of trying to turn soccer into a statistical, data-centric sport; it simply tries to do too much with too little. You're left with a lot of extrapolation, most of which is likely to be disproved before the end of the next World Cup.

The book's main points are these: Rich, prosperous countries and municipalities have more success than poor ones, though there are two notable exceptions (England and Brazil). The transfer market is very inefficient because people who manage soccer clubs, despite their success in other endeavors--or perhaps because of it--do not make good decisions when it comes to managing their clubs' resources (again one notable exception: Lyon). Soccer is not only not big business, but actually rather small potatoes. England are crap and will probably never win another World Cup.

The chapter on England that opens the book is also its best. Hopefully England supporters will read it before the World Cup. Then, when we (the U.S.) beat them in the opening match it will be less of a surprise--and also less of a catastrophe--for the sport's mother country and its bloodthirsty press. So why are England crap? Simple: The country has never "developed resources" beyond its working-class roots. The English national team is still largely made up of proletarian yobs. To illustrate, the book provides a table with members of England's last three World Cup teams and their fathers' professions. Besides the ones whose dads were professional soccer players or coaches, only David James, Peter Crouch and Gareth Southgate appear to have middle-class backgrounds. "When you limit your talent pool, you limit the development of skills," Kuper and Szymanski write. Yes indeed.

Okay, then what about countries like Nigeria, Russia and Mexico, all of which have soccer-mad populations north of 100 million but none of which has ever appeared in a World Cup semifinal? The same reason, really: managing resources. "People all over the world might want to play [organized] sports, but to make that happen requires money and organization that poor countries don't have."

Here is where the authors' thesis starts to get a bit dicey. How do they explain Brazil, a poor country that has won more World Cups than anybody? Or Argentina, which wasn't exactly rich when it won World Cups? They acknowledge Brazil is an anomaly, but say Brazilian players are overvalued on the transfer market. Then they laud the success of Olympique Lyon, who have somehow managed to "buy low/sell high" almost exclusively with Brazilian imports.

They also have high praise for Arsene Wenger. It's hard to argue that the Frenchman hasn't done great things for Arsenal and that his methods haven't reinvigorated the game in England. But despite being one of the richest clubs in the world, Arsenal has won little silverware in England and none in Europe since Wenger's arrival. Manchester United, par contre, have had unparalleled success the past two decades even though the team's (Scottish) manager does not have an advanced degree in economics and presumably employs none of Wenger's new-age methods.

It just doesn't add up. The Soviet Union had a run of almost 50 years with a highly organized system and more resources than anybody else but didn't win anything. When its clubs did win, it was clubs from places like Tbilisi and Minsk, not population centers like Moscow and Leningrad. Mexico may not be rich but its clubs have more money (and resources) than anybody outside Western Europe. The first African nation to make inroads internationally (Cameroon) does not even have the 10th-largest population on the continent and is certainly not its richest.

The authors' curious choice of Iraq as an "emerging" soccer nation is even more questionable considering it is right next to Saudi Arabia. The countries are comparable in population size, but one would think the Saudis have more money and organization dedicated to soccer these days. Another country they tapped for soccer greatness, China, has had very limited success with team sports of any kind (despite its resources). South Korea has both the resources and the know-how to manage them and made the semifinals of the World Cup to boot, but the book barely mentions the Taeguk Warriors.

In the end, it is a typical example of over-reaching to make the data fit your ideas rather than vice versa. You can't fault the authors for trying, but it's a losing proposition from the word go. Unlike sports such as baseball and (American) football, soccer simply does not lend itself to statistical analysis. It just isn't wired that way. The game cannot be parceled up and broken apart with numbers or even facts. The story of a soccer match cannot be told in its box score and there is still no statistic that properly measures a player's contributions. This is starting to change with metrics like tackles, passes and distances run, but the sample size is very, very small. Moreover, even the crudest data, goals scored and against, does not always reflect the reality of what transpired on the pitch. In soccer, the best team does not always win. Over the course of a full season, the best team usually (though not always) ends up winning more than the rest, which is why you need a single table and a full home-and-away schedule to determine a righteous champion. But neither the World Cup nor its qualifying tournaments have this, which is one reason why international matches cannot be trusted as a proper metric for statistical modeling. The European club tournaments aren't much better, though they have been more just in the Champions League era (with its group stages) than before, when each round was drawn completely at random. Yet these make up most of the book's data sample.
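For anyone who wants to poke at the league-versus-knockout point themselves, here is a rough sketch in Python. Everything in it is my own assumption rather than anything from the book: a toy field of 16 teams in which one side is twice as strong as the rest, a win chance proportional to strength, no draws and no home advantage. The function names (`beats`, `league_champion`, `knockout_champion`, `best_team_title_rate`) are made up for the illustration. It only compares how often the strongest team ends up champion under a full home-and-away table versus a randomly drawn single-elimination cup.

```python
import random


def beats(a, b, strength, rng):
    """Play one match and return the winner.

    Assumed toy model: win chance is proportional to strength, no draws.
    """
    p_a = strength[a] / (strength[a] + strength[b])
    return a if rng.random() < p_a else b


def league_champion(strength, rng):
    """Full home-and-away table: every ordered pair of teams plays once."""
    n = len(strength)
    wins = [0] * n
    for home in range(n):
        for away in range(n):
            if home != away:
                wins[beats(home, away, strength, rng)] += 1
    top = max(wins)
    # Ties on wins are broken at random (an assumption; real leagues use goal difference).
    return rng.choice([team for team in range(n) if wins[team] == top])


def knockout_champion(strength, rng):
    """Single-elimination cup with a completely random draw.

    Requires the number of teams to be a power of two.
    """
    teams = list(range(len(strength)))
    rng.shuffle(teams)
    while len(teams) > 1:
        teams = [beats(teams[i], teams[i + 1], strength, rng)
                 for i in range(0, len(teams), 2)]
    return teams[0]


def best_team_title_rate(tournament, strength, trials, seed):
    """Fraction of simulated tournaments won by the strongest team."""
    rng = random.Random(seed)
    best = strength.index(max(strength))
    titles = sum(tournament(strength, rng) == best for _ in range(trials))
    return titles / trials


# Hypothetical field: one team twice as strong as 15 equal rivals.
STRENGTH = [2.0] + [1.0] * 15

league_rate = best_team_title_rate(league_champion, STRENGTH, 1000, seed=1)
knockout_rate = best_team_title_rate(knockout_champion, STRENGTH, 1000, seed=1)

print(f"best team wins the league in {league_rate:.0%} of seasons")
print(f"best team wins the cup in {knockout_rate:.0%} of tournaments")
```

Under these assumptions the strongest side needs four straight wins at two-in-three odds to lift the cup, which works out to roughly one time in five, while the 30-match table gives it far more room to recover from a bad day. The exact numbers depend entirely on the made-up strengths, so treat it as an illustration of the argument, not evidence for it.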

So Soccernomics has no chance. The data is flawed to start, and the authors do it no favors by extrapolating to make points that aren't there to begin with. It's lose-lose. Kuper and Szymanski (and their editors) deserve credit for producing a work that is easy and fun to read and raises some interesting questions. But soccer will never lend itself to complete statistical analysis for the same reason that films, artists and actors won't. It's just too visceral.


  1. Spot on - I feel like the authors grabbed a sample group, already had a thesis, and then tried to spin the results to fit the thesis, not vice versa.

    Still, I enjoyed the part on penalty kicks and game theory immensely. In part because I take spot kicks, study them religiously, and think the mind games/trash talking is fantastically entertaining.

  2. True enough. The penalty kicks/game theory stuff was very insightful. I forgot about that.

  3. Interesting. To an extent, I think there are fair elements:

    a) Statistical analysis of players can show 'hidden value'. For example, anyone looking at Lucas Leiva via media attention would think that he was the worst player around. He does however have the highest tackle rate of any midfielder and is the most accurate passer. Some may say well 'passing it sideways to Mascherano is easy' but that isn't the point here. If he is getting the ball and distributing it well he is doing the job of a defensive midfielder. I'll concede that the book doesn't go into that in great depth.

    b) It is easy to conflate statistical analysis of outcomes and of ability. The book doesn't tell us that there are great ways to evaluate ability but does try and look at past results. I have problems with this as you do - and, indeed, blogged about it:


    c) The only time they look at value of players is saying that certain nations are over-valued - this is almost certainly true. Buying Brazilian or Dutch is to an extent a quality marque: We know that they probably have had excellent training and see past ability markers. Why not look to Africa?

    d) Lyon - yes, great example from them but I think Porto would have been a better one. Winners of the UEFA Cup and Champions League and constantly recruiting cheaply from (erm) Brazil only to sell them on at great prices.

    Incidentally, how much of Lyon's success is down to Chelsea being loaded? Michael Essien and Florent Malouda make up £37m of their income over the last few years.

    Wenger's Arsenal would have been interesting to study. He seems to (often) identify talented players and buy them for very little - Sagna, Clichy, Vieira, Anelka, Diaby, Song. All of these will be sold on for profit at some stage.


    An interesting piece here:


  4. Had I written a proper review and not a Twitter post (thanks for the nod), I'd've certainly said similar things. This book would've been brilliant if the authors had dropped some points that didn't make sense. As a "writer," I know how hard it is to eliminate 1000s of words once you realize your point isn't as strong -- it's very difficult. But still... good read, and some compelling stories.