Sunday, February 1, 2009

Even Strength Shooting Percentage

To what extent is team-to-team variation in even strength shooting percentage the product of random variation? I'm not sure what the answer is, but I suspect that the contribution is substantial. I've included several graphs below in order to illustrate this. The table below the first graph contains the data upon which each distribution is based.



In the first graph, the yellow line is the actual spread in EV (note: 5-on-5 only) shooting percentage that exists among NHL teams at this point in the 2008-09 NHL season.

The X-axis contains the percentage 'categories'; the figure listed for each one is the midpoint value of that category.

The Y-axis is the relative frequency of each individual percentage 'category'.

As an example, 6 teams in the NHL this year currently have an EV shooting percentage that is between 0.08 and 0.085. As there are 30 teams in the league, the relative frequency is 0.2 (as 6/30 = 0.2). The midpoint value for this category is 0.0825. Therefore, the relative frequency of the '0.0825' category is 0.2.
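The binning described above can be sketched in a few lines of Python. This is only an illustration: the function name `relative_frequencies` and the sample league are hypothetical (the values are chosen so that 6 of 30 teams land in the 0.08 to 0.085 category), and the 0.005 category width is inferred from the example.

```python
import math
from collections import Counter

def relative_frequencies(percentages, width=0.005):
    """Return {category midpoint: share of values falling in that category}."""
    counts = Counter()
    for pct in percentages:
        # Category index; the inner round() guards against floating-point
        # error when a value sits right on a category edge.
        index = math.floor(round(pct / width, 9))
        midpoint = round((index + 0.5) * width, 4)
        counts[midpoint] += 1
    total = len(percentages)
    return {mid: n / total for mid, n in sorted(counts.items())}

# Hypothetical league: 6 of 30 teams fall between 0.08 and 0.085.
league = [0.082] * 6 + [0.091] * 24
print(relative_frequencies(league))
```

With this made-up league, the '0.0825' category gets a relative frequency of 6/30 = 0.2, matching the arithmetic in the example.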

The pink line shows the predicted spread in EV shooting percentage if each team had the exact same underlying shooting percentage of ~0.085 (i.e. the league average 5-on-5 shooting percentage). This was determined as follows.

100 "seasons" were simulated.
For each "season", each team has an artificial shooting percentage.
This percentage is the number of goals that a team scores over x number of trials.
The number of trials is equivalent to the number of EV shots that the team has taken through this point in the season.
The probability of "scoring" in each individual trial is the same for every team at 0.085.
Therefore, any team-to-team variation will be the product of randomness.
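The procedure above can be sketched roughly as follows. This is not the code used for the graphs, just a minimal standard-library version of the same idea. Only Philadelphia's 984 shots comes from the data; the other team labels and shot counts are hypothetical placeholders, and the function names and the seed are assumptions made for the illustration.

```python
import random

LEAGUE_AVG = 0.085  # same per-shot scoring probability for every team

def simulate_season(shots_by_team, p=LEAGUE_AVG, rng=random):
    """One simulated season: each shot is an independent trial scoring with probability p."""
    result = {}
    for team, shots in shots_by_team.items():
        goals = sum(rng.random() < p for _ in range(shots))
        result[team] = goals / shots  # artificial shooting percentage
    return result

def simulate_seasons(shots_by_team, n_seasons=100, p=LEAGUE_AVG, seed=0):
    """Repeat the season simulation n_seasons times with a seeded generator."""
    rng = random.Random(seed)
    return [simulate_season(shots_by_team, p, rng) for _ in range(n_seasons)]

# PHI's 984 is from the post; TEAM_B and TEAM_C are hypothetical placeholders.
shots = {"PHI": 984, "TEAM_B": 1100, "TEAM_C": 1250}
seasons = simulate_seasons(shots)
print(seasons[0])
```

Because every team shares the same `p`, any spread in the printed percentages comes from the random draws alone.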

A specific example will hopefully make this clear.

Philadelphia has taken 984 shots at EV at this point in the 2008-09 season. Therefore, Philadelphia has 984 trials. The probability of scoring in each individual trial for Philadelphia is the league average EV shooting percentage at ~0.085. In Philadelphia's first "season", they scored 107 times. As 107/984 = ~0.109, Philadelphia's EV shooting percentage for their first "season" is 0.109.
I then did this for every team and repeated the process 100 times (i.e. simulated 100 seasons). Here's how the first 48 or so shaped out:




Even though the probability of a goal on any given "shot" is 0.085, the artificial shooting percentage will necessarily differ from 0.085 because the sample size is limited. While it goes without saying, as the sample size (number of trials) increases, any given team's artificial shooting percentage will more closely approximate 0.085. Teams that have taken more shots through this point in the 2008-09 season have more "trials", so the spread in their artificial shooting percentage is narrower. For example, the standard deviation for Detroit's 100 seasons is ~0.007. By comparison, the same value for Pittsburgh is ~0.009.
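The link between shot volume and spread can also be written down directly. For a binomial proportion, the standard deviation is sqrt(p * (1 - p) / n), where n is the number of shots. The sketch below assumes the same p = 0.085; the function name is hypothetical, and apart from 984 (Philadelphia's total) the shot counts are made-up round numbers.

```python
import math

def shooting_pct_sd(shots, p=0.085):
    """Standard deviation of a binomial proportion: sqrt(p * (1 - p) / shots)."""
    return math.sqrt(p * (1 - p) / shots)

# 984 is Philadelphia's total from the post; the other counts are hypothetical.
for shots in (500, 984, 1500, 2000):
    print(shots, round(shooting_pct_sd(shots), 4))
```

The printed values shrink as the shot count grows, which is the same pattern seen in the simulated standard deviations for high-volume and low-volume teams.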

The same rules regarding the x and y axes that apply to the yellow (actual) distribution also apply to the pink (random) distribution. The relative frequency for the pink distribution is the proportional representation of each artificial shooting percentage category. As an example, as there were 100 "seasons" and 30 teams, the entire sample consisted of 3000 artificial shooting percentages. 601 artificial percentages fell between 0.08 and 0.085. The relative frequency for the '0.0825' category is therefore ~0.2, as 601/3000 = ~0.2.


As many will note, the spread between the worst (NYI at 0.069) and best (BOS at 0.108) teams appears to be sizable, as is indicated by the breadth of the yellow distribution.

However, the pink distribution is itself fairly broad. In fact, it very closely resembles the yellow distribution. As would be anticipated, the yellow distribution is slightly broader than its counterpart, but the difference is not large. This suggests that much of the inter-team variation in EV shooting percentage is the result of randomness.


The second graph, shown above, contains a 'smoothed' version of the actual distribution, which is represented by the dark line. The average shooting percentage in the league is currently ~0.085, as has been mentioned. The standard deviation is currently ~0.01. The dark graph is simply a normal distribution (bell curve) with a mean of 0.085 and standard deviation of 0.01.

The light line is merely the pink distribution reproduced. Again, the two distributions are very similar to one another.
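For anyone who wants to reproduce the smoothed curve, the share of teams it assigns to a given category can be read off the normal distribution's cumulative distribution function. The sketch below uses Python's `statistics.NormalDist` with the mean (0.085) and standard deviation (0.01) given above; the helper name `category_share` is hypothetical.

```python
from statistics import NormalDist

# Normal curve with the league mean and standard deviation quoted above.
smoothed = NormalDist(mu=0.085, sigma=0.01)

def category_share(low, high, dist=smoothed):
    """Share of the distribution that falls between low and high."""
    return dist.cdf(high) - dist.cdf(low)

# Share assigned to the 0.08 to 0.085 category (midpoint 0.0825).
share = category_share(0.08, 0.085)
print(round(share, 3))
```

The result is roughly 0.19, close to the ~0.2 relative frequency found for the '0.0825' category in both the actual and the simulated data.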

The fact that the actual distribution is somewhat broader than the expected distribution shows that teams do indeed differ in their underlying shooting percentage at EV. Nonetheless, this variation is only very slightly larger than what would be predicted by chance alone. The underlying differences appear to be minimal.

Vic Ferrari has done a lot of excellent, excellent work over at his site that is similar to this. Much of his work has examined the ability of individual players to influence shooting and save percentage while on the ice. His findings are comparable in that the vast majority of inter-individual variation seems to be due to random variation.

EDIT: I've included some supplementary data tables for the purposes of clarity.

I should mention that the data I used for this post was obtained at behindthenet -- an awesome site that I highly recommend. Without it, this post wouldn't have been possible.

10 comments:

sunnymehta.com said...

Awesome. I meant to share something similar to this with you guys. A couple months ago, on my publisher's poker forum, I was commenting about shooting percentage in an off-topic sub-forum's NHL Hockey thread. A poster created the following chart...

http://forumserver.twoplustwo.com/showpost.php?p=7402193&postcount=698


Here's the take home point:

"So after the whole season last year, if you gave an average team an average season and let them ride ordinary variance they could at the end of the year rank anywhere in the top 25 in goals per game."

JLikens said...

Interesting post at the forum.

The whole issue really causes one to wonder about the differences in underlying ability between a 'good' percentage team and a 'bad' one, particularly if the differences are especially pronounced at EV.

I mean, with teams like Detroit and San Jose -- teams that are so dominant territorially and on the shot clock -- there can be hardly any doubt at all that such teams are just fundamentally good hockey teams.

On the other hand, teams like the Hawks and Bruins may be elite in terms of goal differential, but one has to wonder how good they actually are, given the fact that their success thus far has almost entirely been the product of the percentages.

JLikens said...

Heh. I probably should have actually checked the Hawks' (even strength) shot differential before implying that they might not be a legitimately good team.

As it happens, they're pretty impressive in that regard.

My mistake.

sunnymehta.com said...

Yeah, you should have said Penguins. I'd really like to know what the deal is with them. They got outshot badly at ES last season and still posted a positive goal diff due to good percentages, and they're up to the same old shtick this season.

Is it luck? Is it a style of play thing? Is it driven by skilled players? I mean, heck, they really only have two super-skilled players.

And speaking of Crosby and Malkin, I was doing some comparing of them to Ovechkin. What is the deal? Ovechkin's Corsi is like +282 and Crosby's is like -40 and yet their on ice GF and GA are almost the same. lol. Wtf? Is this being driven by team? (Personnel as well as system?) I.e. - if we put Crosby on Washington would he have a SIGNIFICANTLY better Corsi?

You know, one thing that did occur to me is that maybe instead of using Corsi (which is an overall +/- of shots directed at net), we should break it down into SDF and SDA (shots directed for and shots directed against) or even SDF/60 and SDA/60 or even the ratio of [SDF/60]/[SDA/60].

Because, in theory, let's say you have two players (we'll say they've played the same number of minutes). Player A has 100 SDF and 50 SDA (Corsi of 50), and player B has 70 SDF and 50 SDA (Corsi of 20). Player B is a very skilled offensive player and has a much better S% than player A. Since the two players have the same SDA, they could actually be dominating possession equally, it's just that when Player B is in the offensive zone, he's looking for better shots.

I have no idea if there's evidence of this sort of thing in practice today, but it occurred to me because when I was comparing Ovie to Crosby I noticed that their shots against aren't all that much different. Their shots for are pretty different, but that could be an indication of style more than an indication of possession dominance.

If I had to guess, there are probably only a few players who are skilled enough for this theory to make sense for though. For most players, a raw Corsi might be just as effective.

JLikens said...

"Yeah, you should have said Penguins. I'd really like to know what the deal is with them. They got outshot badly at ES last season and still posted a positive goal diff due to good percentages, and they're up to the same old shtick this season.

"Is it luck? Is it a style of play thing? Is it driven by skilled players? I mean, heck, they really only have two super-skilled players."


It was definitely partly luck, although I'm not sure to what extent.

For example, the Pens had far and away the best PDO number in the league last season, which is strongly indicative of good luck.

That said, I'm more inclined to view their high EV SV% as being the result of good luck than I am with respect to their EV S%.

Both of their goalies, Conklin and Fleury, posted excellent save percentages at EV, despite neither of them having excellent track records in that regard. Furthermore, Ryder's data showed that the Pens were roughly average in SQA last season, so it wasn't as if the team was especially good at limiting scoring chances. Both of those factors led me, at the time, to believe that their 0.930 team save percentage at EV was primarily the result of good fortune. The fact that that number has entirely regressed to the mean this season basically confirms that suspicion.

The EV S%, on the other hand, might actually be (partially) sustainable. Team style is definitely relevant. Having watched a fair number of Pens games over the past two seasons, I've found that Pittsburgh as a team is relatively reluctant to shoot the puck when they have possession in the offensive zone. The surfeit of offensive talent -- even if basically limited to two players -- doesn't hurt either.

Although it's possible that their low EV SF could be due to them simply getting dominated territorially, the fact that they tend to draw more penalties than they take suggests that other factors are at play. It's also worth mentioning that there is a mild negative correlation between EV S% and EV SF at the team level. Of course, they're currently shooting something around 0.105, which is ridiculously good and not realistically sustainable. But if someone were to suggest that their true EV S% was in the 9-10% range, I'd find it difficult to disagree.

"And speaking of Crosby and Malkin, I was doing some comparing of them to Ovechkin. What is the deal? Ovechkin's Corsi is like +282 and Crosby's is like -40 and yet their on ice GF and GA are almost the same. lol. Wtf? Is this being driven by team? (Personnel as well as system?) I.e. - if we put Crosby on Washington would he have a SIGNIFICANTLY better Corsi?"

Well, it's almost entirely the result of Crosby having the better on-ice EV S%. Their on-ice EV SV% and on-ice are virtually identical.
Again, stylistic differences at the team level (and differences in style of play between Crosby and Ovechkin, for that matter) are relevant here. I have no doubt that Crosby's corsi would be better if he played on the Capitals, seeing as how:

a) he plays on a team that just gets murdered in terms of corsi
b) he has one of the best corsi numbers on his own team (once controlling for ice time)

Given all of that, I think that your theory about decomposing corsi into SDF and SDA is an attractive one, at least for skilled players (as you said). While EV SA might be a suitable proxy for territorial play, it seems that the same does not necessarily apply to EV SF. Players do have some ability to affect their on-ice EV S% -- thus, two players (Crosby and Malkin, in this case) can have vastly discrepant EV SF numbers while nevertheless having similar rates of offensive production. This, of course, is attributable to the player with less EV SF having the superior shooting percentage, which, in some cases, would merely be the result of team/player stylistic differences, rather than luck or randomness. Seems plausible enough.

Mike said...

My name is Mike and I run www.NHLsnipers.com. Just curious if you would like to swap links.

Let me know in a quick comment on my site or email me: nhlsnipers at gmail.

If so, send me the link you want listed.

Thanks

xanax said...

Wow, nice post,there are many person searching about that now they will find enough resources by your post.Thank you for sharing to us.Please one more post about that..

Buy Cheap Maxon Cinema 4D R12 Studio said...

In this blog is very nice post.so great finance advise to them.

Claudio Timbers said...

Hey, great blog, but I don’t understand how to add your site in my rss reader. Can you Help me please. buy proxies

Hostpph said...

Well I have to admit that it is quite hard to look to those chart because you have to click them to see them bigger.