Hockey Analytics – Other Bits

Okay, there are a few other things that can be included in the hockey analytics umbrella, so this post will briefly glance over them.

Sample Size – This is always a criticism of statistics in general. How big is a good sample size? Some would argue that one game is surely not enough, neither is a playoff series, nor is a season.

Even strength – not a statistic per se, but the majority of statistics are gathered during even strength play. A game is at even strength when both teams have the same number of players on the ice (5v5, 4v4, 3v3), but 5v5 gives the clearest results. A power play (or man advantage) happens when team B receives a penalty and team A has one more man on the ice, e.g. 5v4. Conversely, team B will be on the penalty kill, or is considered shorthanded, as it has one fewer player on the ice. There is also a term called score close. During the 1st or 2nd period, it is when the game is tied or there is only a one goal difference. In the third period, it is only when the score is tied. Nil-nil is included in this. In an experimental sense, 5v5 close can be considered the control, and a penalty kill or power play can be considered experimental variants.
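If it helps to see the score close rule written out, here is a rough sketch in Python. The function name and the decision to reject overtime are my own assumptions; the rule itself is just the one described above.

```python
def is_score_close(period: int, goal_diff: int) -> bool:
    """Return True when the game state counts as 'score close'.

    period: 1, 2 or 3. Overtime is not covered by the definition in the
            text, so it is rejected here (an assumption of this sketch).
    goal_diff: one team's goals minus the other's; the sign is ignored.
    """
    if period not in (1, 2, 3):
        raise ValueError("only regulation periods 1-3 are handled")
    margin = abs(goal_diff)
    if period in (1, 2):
        # Tied or a one goal game in the first two periods.
        return margin <= 1
    # Third period: only a tied game counts.
    return margin == 0


print(is_score_close(1, 0))   # True (nil-nil is included)
print(is_score_close(2, -1))  # True
print(is_score_close(3, 1))   # False
```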

Score Effects – Removing score effects provides more reliable Corsi and Fenwick data over time. When a team has a lead greater than one goal, particularly in the third period, it alters play, and the winning team becomes more defensive. In soccer, this is known as parking the bus. In an attempt to level the score, the losing team takes more shots, maybe a little more wild and ambitious, and spends more time in the offensive zone, and there may also be a possession shift. This skews the data, which is also why data is examined when the score is tied.

Goalies – It’s hard to measure a ‘successful’ goalie. Measuring just losses is not fair on a goalkeeper as it’s too team dependent, and goals against average isn’t fair either as the goalie can’t control how many shots he’ll face. Save percentage is used because it doesn’t punish great goalies on bad teams, and using even strength save% removes the influence of special teams.
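To make the difference concrete, here is a small Python sketch comparing the two measures. The two goalies and all of their numbers are invented for illustration, and the function names are mine.

```python
def save_percentage(shots_against: int, goals_against: int) -> float:
    """Fraction of shots on goal that the goalie stopped."""
    if shots_against <= 0:
        raise ValueError("shots_against must be positive")
    return (shots_against - goals_against) / shots_against


def goals_against_average(goals_against: int, minutes_played: float) -> float:
    """Goals allowed per sixty minutes played."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return goals_against * 60 / minutes_played


# Hypothetical goalie on a bad team: faces twice as many shots.
busy_sv = save_percentage(shots_against=1200, goals_against=90)
busy_gaa = goals_against_average(goals_against=90, minutes_played=1800)

# Hypothetical goalie on a good team: same minutes, half the shots.
quiet_sv = save_percentage(shots_against=600, goals_against=60)
quiet_gaa = goals_against_average(goals_against=60, minutes_played=1800)

print(f"busy goalie:  save% {busy_sv:.3f}, GAA {busy_gaa:.2f}")
print(f"quiet goalie: save% {quiet_sv:.3f}, GAA {quiet_gaa:.2f}")
```

In this made-up case the busy goalie has the worse goals against average but the better save percentage, which is the point being made above.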

Scoring Chance – Basically, it is any attempt to score a goal from the slot (the area between the faceoff circles directly ahead of the goalie). This data isn’t released by the NHL to the public, whereas the majority of the rest is. It’s a way to differentiate between shots that should be quite easy to save and ones that are not.


Hockey Statistics – Other Stats & Luck

The NHL is beginning to board the statistics train and in addition to the Corsi/Fenwick umbrella and plus/minus, there are a number of other stats we can look at to examine play.

Takeaway/Giveaway – these are possession stats again. Possession = success. You need to have possession of the puck to give it away; likewise, for a team to take the puck, they can’t have been in possession of it to begin with. If you spend the season racking up high numbers for these two then it shows that the other teams you’re competing against tend to have the puck a lot more than you; possession is positively correlated to wins.

Those two are real time stats, as are hits, but they are subjective and unreliable, for example as a result of home/away bias in the rink.

Face Offs  – Face offs occur in the offensive, defensive, and neutral zones and can result in a win or a loss for the player involved, so this is another stat we can use to examine their effectiveness.

Time on Ice – Players like Duncan Keith rack up a lot of minutes on the ice, and we know he’s a good player. Then we have 4th liners who don’t get a lot of ice time. But it’s not really much more informative than that.

Statistics are all well and good, but there will always be an element of chance. Gabriel Desjardins estimates that 75% of wins are due to Corsi scores plus luck.

PDO 

PDO is also known as SPSV% by the NHL. PDO was proposed by Brian King, who happened to use those three letters as his online handle.

PDO = (shooting%) + (save%)

It is used as a proxy for how lucky a team is, because both shooting% and save% are primarily luck driven. It’s based on the theory that most teams will regress towards a PDO of 100. Anything that is significantly above or below that is extremely lucky or unlucky.
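Here is a quick Python sketch of the calculation, with PDO taken as shooting percentage plus save percentage on a 0 to 100 scale, which is the form that centres on 100. The function name and the sample numbers are made up for illustration.

```python
def pdo(goals_for: int, shots_for: int,
        goals_against: int, shots_against: int) -> float:
    """Shooting% plus save%, both expressed on a 0-100 scale."""
    if shots_for <= 0 or shots_against <= 0:
        raise ValueError("shot totals must be positive")
    shooting_pct = 100 * goals_for / shots_for
    save_pct = 100 * (shots_against - goals_against) / shots_against
    return shooting_pct + save_pct


# Hypothetical team scoring and conceding at the same rate: PDO of 100.
average_team = pdo(goals_for=8, shots_for=100, goals_against=8, shots_against=100)

# Hypothetical team riding hot shooting and hot goaltending: PDO above 100.
lucky_team = pdo(goals_for=10, shots_for=100, goals_against=6, shots_against=100)

print(round(average_team, 1), round(lucky_team, 1))
```

On this reading, the second team would be expected to drift back towards 100 as the season goes on.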

 

 

Hockey Statistics – Corsi & Fenwick

In the last post, I explained how plus/minus should be banished to the fiery depths of hell. So what can we use instead? Jim Corsi to the rescue! You have probably heard of Corsi scores and it’s really not that complex. Jim Corsi, a goalie coach for the Buffalo Sabres, tracked shot attempts to measure the goalie’s workload, and the Corsi score was developed from that.

Corsi = (SF + MSF + BSF) - (SA + MSA + BSA)

The formula is not particularly difficult. In its rawest form, it’s shots for minus shots against.

SF  = on ice shots for             SA  = on ice shots against
MSF = on ice missed shots for      MSA = on ice missed shots against
BSF = on ice blocked shots for     BSA = on ice blocked shots against

 

A positive Corsi basically shows that you are directing more shots at the net than your opponent, likely as a result of more possession, which shows you’re good at generating chances.

There is also a similar measure called Fenwick, which is the same as Corsi minus the blocked shots. For example, if Toews is on the ice with 10 shots for and 3 against, he’d have a Corsi score of +7 (10-3). Now, if we say that of those ten shots for, two were blocked, and for the three against, one was blocked, Toews’ Fenwick score would be +6 (8-2). Sometimes the two are expressed as percentages to make them easier to compare.
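The Toews example can be worked through in a few lines of Python. The text only gives total and blocked attempts, so the split between shots on goal and missed shots below is my own assumption, as are the class and function names.

```python
from dataclasses import dataclass


@dataclass
class ShotAttempts:
    """On-ice shot attempts for one side, split by outcome."""
    on_goal: int
    missed: int
    blocked: int

    def corsi_events(self) -> int:
        # Corsi counts every attempt, blocked or not.
        return self.on_goal + self.missed + self.blocked

    def fenwick_events(self) -> int:
        # Fenwick leaves the blocked attempts out.
        return self.on_goal + self.missed


def share(events_for: int, events_against: int) -> float:
    """Percentage of all events that belong to the player's team."""
    total = events_for + events_against
    if total == 0:
        raise ValueError("no shot attempts recorded")
    return 100 * events_for / total


# Ten attempts for (two blocked), three against (one blocked).
# The on_goal/missed split is assumed; only the totals come from the text.
attempts_for = ShotAttempts(on_goal=6, missed=2, blocked=2)
attempts_against = ShotAttempts(on_goal=1, missed=1, blocked=1)

corsi = attempts_for.corsi_events() - attempts_against.corsi_events()
fenwick = attempts_for.fenwick_events() - attempts_against.fenwick_events()
corsi_pct = share(attempts_for.corsi_events(), attempts_against.corsi_events())
fenwick_pct = share(attempts_for.fenwick_events(), attempts_against.fenwick_events())

print(corsi, fenwick)  # 7 6
print(f"Corsi% {corsi_pct:.1f}, Fenwick% {fenwick_pct:.1f}")
```

The percentage form is just shots for divided by all shots, which is why 50% is the break-even line.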

So why is Corsi useful? Well, it is shown to have a close correlation to scoring chances, puck possession and territorial advantage. It’s a good predictor of success too.

[Chart: team Fenwick percentages and playoff results, 2007-2013]

This beautiful chart looks at Fenwick scores from 2007-2013. The rings represent each round of the playoffs, and a team’s distance from the outer ring shows how far it was from a playoff berth. Possession percentages are shown between .400 and .600; the further from .400 the better. Above .500 is the magic number as it gives you a 75% chance of making the playoffs; above .550 gives a 25% chance of winning the Stanley Cup. Not bad.

(Outliers do exist of course, but on this chart only one outlier won the cup… the 2009 Pittsburgh Penguins. What happened in that season? They fired their coach and hired Disco Dan, who gave them a .599 team Fenwick, thus leading them to their cup.)

Corsi and Fenwick are good, but remember CONTEXT. The score can be altered by other variables, like a player’s team mates, the competition, or zone starts. So… time for some developments.

Corsi Relative – Measures the difference between the team’s Corsi when the player is on the ice and the team’s Corsi when he’s on the bench. This allows us to see what effect the player has on the team. Is the player stuck on a bad team?
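As a rough sketch, Corsi Relative can be computed as the team's Corsi percentage with the player on the ice minus its Corsi percentage with him on the bench. Using percentages rather than raw counts or per-sixty rates is an assumption on my part, and the numbers are invented.

```python
def corsi_pct(attempts_for: int, attempts_against: int) -> float:
    """Team share of all shot attempts, as a percentage."""
    total = attempts_for + attempts_against
    if total == 0:
        raise ValueError("no shot attempts recorded")
    return 100 * attempts_for / total


def corsi_relative(on_for: int, on_against: int,
                   off_for: int, off_against: int) -> float:
    """On-ice Corsi% minus off-ice Corsi%, in percentage points."""
    return corsi_pct(on_for, on_against) - corsi_pct(off_for, off_against)


# Hypothetical player: the team gets 55% of attempts with him on the ice
# and 48% of attempts while he sits on the bench.
rel = corsi_relative(on_for=55, on_against=45, off_for=480, off_against=520)
print(f"Corsi Rel: {rel:+.1f}")
```

A positive number suggests the team does better with the player than without him, even if the team as a whole is below 50%.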

Corsi Relative Quality of Teammates – Examines the teammates, weighted by time on ice together; does the player have bad line mates?

Corsi Relative Quality of Competition – Examines opposing players and is weighted by head to head ice time; is the player lining up against the best?
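Both quality measures boil down to a weighted average, so here is a minimal Python sketch of that idea. It assumes each teammate or opponent already has a Corsi Relative figure and that the weight is simply the minutes shared on the ice; the published versions of these stats may be calculated differently, and every number below is invented.

```python
from typing import Iterable, Tuple


def toi_weighted_average(ratings_and_minutes: Iterable[Tuple[float, float]]) -> float:
    """Average of ratings, weighted by minutes shared on the ice.

    Each item is (other player's Corsi Rel, minutes on ice together or
    head to head).
    """
    pairs = list(ratings_and_minutes)
    total_minutes = sum(minutes for _, minutes in pairs)
    if total_minutes <= 0:
        raise ValueError("total shared ice time must be positive")
    return sum(rating * minutes for rating, minutes in pairs) / total_minutes


# Hypothetical linemates: lots of time with a strong one, a little with a weak one.
quality_of_teammates = toi_weighted_average([(5.0, 300.0), (-2.0, 100.0)])

# Hypothetical opponents faced head to head.
quality_of_competition = toi_weighted_average([(8.0, 50.0), (0.0, 50.0)])

print(quality_of_teammates, quality_of_competition)
```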

We can also examine zone starts to see if the player is more of a defensive type. This is a usage stat and can be used alongside time on ice to understand how the coach uses the player. There are three types of zone starts: offensive (OZS), defensive (DZS), and neutral (NZS), which show where the player starts his shifts. And time on ice shows us how the coach uses players: even strength (EVTm%), power play (PPTm%), and shorthanded (SHTm%).

A final note about Corsi/Fenwick is that they are not perfect. However, scouting is not perfect. Neither is a fan’s opinion. Nothing we have (yet) is perfect, but it is something.

 

Hockey Statistics

In an attempt to understand and de-mystify the advanced statistics involved in hockey, I’ll be posting a few blogs to break them down. Firstly, hockey calls them advanced statistics even though they’re not particularly advanced. Secondly, the NHL is very slow on the implementation of stats in comparison to MLB or the NBA. Thirdly, statistics are nothing to be freaked out over.

One of the easiest stats to figure out is plus/minus (+/-). A player gets a + every time he is on the ice for an even strength goal scored by his team. A player gets a – every time he is on the ice for a goal scored against his team. Fairly easy, right?

The table below shows the top five and bottom five +/- scores from the 2015/2016 season. I wouldn’t have predicted those guys. Can you imagine a top 5 without at least Crosby, Ovechkin, or Kane popping up? The problem with +/- is that it’s great in theory, but in practice it is too basic. Good players with bad team mates get dragged to hell, whereas bad players with good lineys get their score boosted.

Player            +/-      Player            +/-
Tyler Toffoli     +35      Mikkel Boedker    -33
Anze Kopitar      +34      Radim Vrbata      -30
Brian Campbell    +31      Bo Horvat         -30
Chris Kunitz      +29      Jordin Tootoo     -26
Colton Parayko    +28      Elias Lindholm    -23

It’s a good way to distinguish good defensive players, but it fails to explore the quality of team mates, the competition, goaltending, or how a player is used in the game. And this is important contextual information. You should never look at a statistic in isolation; it needs context.

Let’s take a look at our favourite Russian, Alexander Ovechkin. In the 2013-2014 season, Ovi posted a -35. Holy cow. That’s third worst in the entire league, only beaten by Steve Ott and Alex Edler. But if we look at the goals he scored that season, it’s the highest in the league with 51. If he scored that many goals, why is his plus/minus score so atrocious? The usual answers: he’s a lazy player. He’s not a good defensive player. He hogs the puck. Okay, we know he’s not the best defensive player, but there has to be more to it than that. As for assists, he recorded 28 that year, which is by no means bad, but he was actually making the passes and his teammates were not scoring from them (we’ll cover that later), and the goaltending was shaky. But if we know nothing about hockey and look at this easy to understand stat then we would most likely assume that this Alexander Ovechkin guy is an awful player.

To further illustrate how the plus/minus figure can be easily skewed, here are some examples.

Scenario: We have two identical hockey players (who happen to be the Sedin twins). Let’s say that with each of them on the ice, their team takes ten shots and allows ten shots per sixty minutes. They both have an on-ice shooting percentage of 10% and an on-ice save percentage of 91%. Both players play 20 minutes per game. Every play, skate, pass etc. is identical.


Example 1: Player A plays 20 minutes every night, but now Player B plays only 10 minutes. Over the season Player A will have double the plus/minus score of Player B because he is on the ice for twice as long each night at the same rates per sixty minutes. It ends up looking like Player A is twice the player that B is when they are actually identical players.

Example 2: Both players are identical again, and both play for the exact same amount of time. However, the goalie for Player A’s team records a save percentage of .930, and Player B’s goalie has a save percentage of .890. More goals will be scored against Player B’s team so he will get a bigger minus, and so Player A has the better +/- score, despite goaltending having nothing to do with either player.
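To put rough numbers on both examples, here is a Python sketch of the expected plus/minus for the identical players. It assumes an 82 game season, treats every goal as an even strength goal, and works with long-run averages rather than simulating games; the function name is mine.

```python
def expected_plus_minus(minutes_per_game: float,
                        on_ice_save_pct: float,
                        games: int = 82,
                        shots_for_per_60: float = 10.0,
                        shots_against_per_60: float = 10.0,
                        on_ice_shooting_pct: float = 0.10) -> float:
    """Expected season plus/minus from on-ice shot rates and percentages."""
    goals_for_per_60 = shots_for_per_60 * on_ice_shooting_pct
    goals_against_per_60 = shots_against_per_60 * (1 - on_ice_save_pct)
    hours_on_ice = games * minutes_per_game / 60
    return hours_on_ice * (goals_for_per_60 - goals_against_per_60)


# Example 1: same rates, different ice time.
player_a = expected_plus_minus(minutes_per_game=20, on_ice_save_pct=0.91)
player_b = expected_plus_minus(minutes_per_game=10, on_ice_save_pct=0.91)
print(round(player_a, 2), round(player_b, 2))

# Example 2: same ice time, different goalies behind them.
good_goalie = expected_plus_minus(minutes_per_game=20, on_ice_save_pct=0.930)
bad_goalie = expected_plus_minus(minutes_per_game=20, on_ice_save_pct=0.890)
print(round(good_goalie, 2), round(bad_goalie, 2))
```

Under these assumptions the first pair differs by exactly a factor of two, and the second pair lands on opposite sides of zero, even though the skaters are identical in both cases.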

The take home message is that a plus/minus score is highly variable and many things can sway it. You could get a +3 from Monday’s game and a -5 from Tuesday’s. It’s not the best method to use and ignores context! So please can we abandon it for good?