Binomial Ranking – Part 2

In my last post I introduced a new system for rating players on their finishing skill. I thought I’d use this post to show some real examples and see how the system fares.

I haven’t started to collect my own data yet, but thanks to the good people at http://www.americansocceranalysis.com/, who provide publicly available player-level data, I can run an analysis of MLS players from the past 5 years. Below is a table of the top finishers in the past 5 MLS seasons based on their binomial probabilities (explained in the previous post). I compared the binomial method to three other methods:

Goals – Expected Goals [Total]
(Goals – Expected Goals) per 50 shots [Per shot]
Goals/Expected Goals [Ratio]

Figures for these methods are given in the blue columns, with players’ respective rankings in the orange columns. I applied a cut-off of 50 shots to the data.

A couple of familiar names sit at the top of the list, with Henry and Beckham showing they still have top quality even without the comfort of rainy Premier League away days.

One of the main things to notice from this list is how the binomial probability method accounts for the main flaws of the other three methods:

Total – Doesn’t account for number of shots taken

Per shot – Punishes players with high number of shots

Ratio – Punishes players with high xG values

The 8th highest player in G-xG, Chris Wondolowski, only has an 81.85% chance of being an above-average finisher. That is still pretty good, to be fair, but it only ranks him 40th in MLS. This is due to his 482 shots, more than any other player.

Players with low shot numbers, like Lloyd Sam, perform well on a per-shot basis, but the lack of reliability, and the probability that his stats just happened by chance, are reflected in his lower binomial probability.

Players with high total xG values, like Thierry Henry and Robbie Keane, really struggle under the G/xG method, even though both players have a >95% chance of being better-than-average finishers.

Since I love creating explanatory examples so much, let’s use one to explain these flaws.

Imagine a player “Jeremy Fisher” who is a perfect finisher. He scores every shot he has, but he’s very selective about his shots, and only shoots when he has an xG of 0.96. If Mr. Fisher takes 125 shots how would all these methods rank him in comparison to the rest of the MLS?

From this we can see that the binomial probability method is the only method that ranks this “Perfect finisher” at the top of our list of MLS players, with the G/xG method giving a ranking outside the top 100 due to a very high total xG value.
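For anyone who wants to check Mr. Fisher’s numbers, here is a minimal Python sketch of the three comparison methods, using only the definitions given earlier in this post. The function name `finishing_metrics` is my own for illustration, and I’m assuming every one of his 125 shots has an xG of exactly 0.96.

```python
def finishing_metrics(goals, shots, xg_total):
    """Return the three comparison figures: Total, Per shot (per 50 shots), Ratio."""
    total = goals - xg_total              # Goals - Expected Goals
    per_50 = total / shots * 50           # (Goals - Expected Goals) per 50 shots
    ratio = goals / xg_total              # Goals / Expected Goals
    return total, per_50, ratio


# Jeremy Fisher: 125 shots, each assumed to be worth 0.96 xG, all of them scored.
shots = 125
xg_per_shot = 0.96
goals = shots
xg_total = shots * xg_per_shot            # about 120 expected goals

total, per_50, ratio = finishing_metrics(goals, shots, xg_total)
print(f"Total: {total:.2f}, Per 50 shots: {per_50:.2f}, Ratio: {ratio:.2f}")
```

So a player who never misses ends up with only about +5 on Total, +2 per 50 shots, and a Ratio of roughly 1.04, because his total xG of about 120 leaves almost no room to over-perform.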

Obviously this is a pretty extreme example but I think it shows the drawbacks of the other methods fairly well.

I also like the idea that players’ finishing skills (and most other skills) are modeled by a probability distribution, so that each player’s output is a random variable, which explains the variance in player performance. This means we can assess the probability that a player’s finishing skill is above average, which is what the binomial probability produces.
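To make that a bit more concrete, here is a rough Python sketch of one way such a probability could be computed. The exact formula is in the previous post, so treat everything here as an assumption: I model a player’s shots as a binomial with n = shots and p = his average xG per shot, and take the probability that an average finisher would score fewer goals than the player actually did. The function name `binomial_finishing_probability` is hypothetical, and a different convention (for example, counting ties) would give slightly different numbers.

```python
from math import comb


def binomial_finishing_probability(goals, shots, xg_total):
    """P(X < goals) for X ~ Binomial(shots, xg_total / shots).

    Assumption: every shot is treated as having the player's average xG,
    and "above average" means scoring more than a binomial draw would.
    """
    p = xg_total / shots                  # average xG per shot
    return sum(
        comb(shots, k) * p**k * (1 - p) ** (shots - k)
        for k in range(goals)             # k = 0 .. goals - 1
    )


# Illustrative check with the made-up "perfect finisher":
# 125 shots at 0.96 xG each, all scored.
fisher_prob = binomial_finishing_probability(125, 125, 125 * 0.96)
print(f"{fisher_prob:.4f}")
```

Under these assumptions the perfect finisher’s value is 1 - 0.96^125, i.e. the chance that an average finisher would miss at least one of those 125 shots.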

I’ll probably finish off this little series of blogs tomorrow, using binomial probabilities to assess goalkeepers’ shot-stopping performances.

If you have any questions, please feel free to tweet me @_peteowen.

Sporting Logic

An analytical overview of the sporting world