Friday, 28 June 2013

Slicing and Dicing: Re-Examining Data on Draft Eligible D-Men

I love getting feedback and hearing from other people when I write stuff, even if it's simply to tell me that I'm an idiot for reasons X, Y and Z. It gives me ideas on how to examine my arguments in different ways and see if my initial conclusions were either flawed or could be refined. A couple of people made some good queries this morning that I feel are worth looking at, so that's what I'll take this time to do.

First off, @garik16 raises a valid point. Assists may be influenced by teammates in both positive and negative ways, so perhaps looking at goals per game played will yield a clearer picture of offensive ability than points per game does, and subsequently forecast NHL success more accurately:
As it turns out, the scatterplots for G/GP vs. NHL% of GP and Pts/GP vs. NHL% of GP behave in very similar manners. Both have large clusters of non-scoring early-round busts at the bottom left-hand corner, while the rest of the plot is pretty random. In both cases, there isn't enough of a linear relationship present to conclude a direct correlation between offensive output in the CHL and NHL success. However, as I said before, there is enough evidence to conclude that guys who don't score in the CHL don't become NHL players. Here's the plot:

It looks like there could be a linear relationship here if the picture wasn't clouded by so much noise at the bottom and so many outliers to the right. I'm not about to eliminate half of my data though, so we're left with this mess.

What I find interesting is that when you move to 0.25 CHL draft year G/GP, most of the guys end up flaming out before making the NHL. In fact, here are the top-10 CHL defensemen in this sample, sorted by G/GP:

Personally, I think this list is hilarious. The only "regular" NHLers here are St. Louis Blues spare part Kris Russell, journeyman Steve Eminger, and one of the all-time busts in Cam Barker. The rest of these guys can't even crack an NHL roster full-time. So how did they perform so well in their draft years? The most probable explanation is something that we see in the NHL all the time: individual shooting percentage is volatile in small sample sizes, leading to inflated goal totals over a single season.

It's in circumstances like this that possession numbers, deployment patterns and shooting percentages would be very informative. After all, we're looking at such a small sample in "defenseman goals in a single short season." Possession metrics effectively expand the sample size, allowing us to paint a more meaningful picture of what's happening on the ice over a shorter time period. Hell, even looking at just individual shots on goal should expand the amount of data available roughly twentyfold (assuming a CHL d-man shoots at something like 5% on average, there are about 20 shots behind every goal). Unfortunately, this data wasn't available to me when I was data mining, so it'll have to wait for another day, if it even exists at all.
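To put a rough number on that sample-size point, here's a minimal sketch. The 5% shooting percentage is the assumption from the paragraph above; the 120-shot season (60 games at 2 shots per game) is a made-up illustrative figure, not something from my data, and the function names are just for illustration.

```python
import math

def shots_per_goal(shooting_pct):
    """How many shots, on average, sit behind each goal."""
    return 1.0 / shooting_pct

def shooting_pct_std_error(shooting_pct, shots):
    """Binomial standard error of an observed shooting percentage."""
    return math.sqrt(shooting_pct * (1.0 - shooting_pct) / shots)

SHOOTING_PCT = 0.05     # the ~5% assumed in the text
SHOTS_IN_SEASON = 120   # hypothetical: 60 games at 2 shots per game

multiplier = shots_per_goal(SHOOTING_PCT)
expected_goals = SHOOTING_PCT * SHOTS_IN_SEASON
std_err = shooting_pct_std_error(SHOOTING_PCT, SHOTS_IN_SEASON)

print(f"shots per goal: {multiplier:.0f}")
print(f"expected goals: {expected_goals:.1f}")
print(f"one-sigma shooting pct range: "
      f"{SHOOTING_PCT - std_err:.3f} to {SHOOTING_PCT + std_err:.3f}")
```

Under those assumptions, a true 5% shooter could easily look like a 3% or a 7% shooter over one season on luck alone, which is the volatility I'm talking about when a defenseman's goal total spikes in his draft year.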

So to finally answer what @garik16 was wondering, looking at G/GP doesn't tell us much, at least not about the data I have. This is probably because goals provide such a small sample of data that they're just too volatile to conclude anything more than what we already knew merely by looking at Pts/GP. While assists may introduce "quality of teammates" noise, eliminating them introduces a lot of randomness noise.


The other point that was mentioned by a couple of people was that guys drafted in the first round tend to be offensive guys, with the implication that guys drafted in the 2nd and 3rd rounds are regarded as riskier anyways:

First off, I'd hesitate to call it a "0.6 Pts/GP rule" since I think putting a set-in-stone number on it tends to make it look like I believe what I'm presenting is gospel. It isn't. I'm merely trying to present evidence to make the point that defensive ability is probably overvalued in CHL defensemen, and that GMs probably shouldn't spend early round picks on guys who exhibit no offensive ability.

Anyways, to test to see if my theory holds up when looking at the data round by round, I divided the total sample of players up into three batches, based on Derek Zona's "Impact Player Percentage." This is essentially a percentage measure of how many players from a particular grouping in the draft turn out to be impact NHL players. Based on Zona's work, a top-25 pick turns into an "impact player" roughly 39% of the time, a player drafted anywhere from 26th to 50th becomes an impact player 15% of the time, and someone taken between picks 51 and 100 becomes an impact player about 7% of the time.

Of course, my definition of a "successful pick" is different than what is defined as an "impact player." Since my definition doesn't stipulate how frequently a player must score at the NHL level, my definition should yield more "successes" than there are impact players. Therefore, the success rate of NHL teams drafting a defenseman in each batch of picks should, in theory, exceed the impact player percentage of that same batch. In other words, since 39% of top-25 players become impact players, significantly more than 39% should be what I call successful picks.

I then divided each batch into two sub-groups of "scorers" and "non-scorers." If defensive defensemen are just as risky or less risky than offensive defensemen, the success rate of each sub-group should be similar to one another, and above the impact player percentage for that range of picks. Here's what the batch-by-batch relationship between NHL % of GP and CHL Pts/GP looked like:

What I notice right away is that a higher proportion of scoring defensemen are drafted in the first 25 picks compared to the 26th - 50th picks and the 51st - 100th picks. However, within these batches, does my initial conclusion that primarily defensive guys are still more risky hold true? I assembled the information into a table to check:

As you can see, scoring CHL defensemen turned into successful draft picks at a higher rate than defensive CHL defensemen in every single batch. Even though the success rate of non-scoring CHL d-men exceeded the batch impact player percentage (labelled as "impact rate") in both the first and second rounds, the difference is so small that I'd feel confident in saying that a well-below-average share of these picks go on to become impact NHL players. An NHL team would be far better off using the pick to take a scoring defenseman or a forward.
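For anyone who wants to redo the batch-by-batch split on their own data, here's a rough sketch of the bookkeeping. The pick ranges and the 40% "NHL regular" cutoff come from these posts; the 0.6 Pts/GP scorer cutoff is an assumption borrowed from the quadrant split in the original post, and every row in `sample` is invented purely for illustration.

```python
from collections import defaultdict

SCORER_CUTOFF = 0.6     # assumed draft-year Pts/GP split between scorers and non-scorers
REGULAR_CUTOFF = 0.40   # share of available NHL games that makes an "NHL regular"

def pick_batch(overall_pick):
    """Map an overall pick number to one of the three batches."""
    if overall_pick <= 25:
        return "1-25"
    if overall_pick <= 50:
        return "26-50"
    return "51-100"

def success_rates(players):
    """Return {(batch, group): share of players who became NHL regulars}."""
    counts = defaultdict(lambda: [0, 0])  # [successes, total]
    for pick, pts_per_gp, nhl_gp_share in players:
        group = "scorer" if pts_per_gp >= SCORER_CUTOFF else "non-scorer"
        tally = counts[(pick_batch(pick), group)]
        tally[0] += int(nhl_gp_share >= REGULAR_CUTOFF)
        tally[1] += 1
    return {key: wins / total for key, (wins, total) in counts.items()}

# Hypothetical rows: (overall pick, draft-year Pts/GP, share of NHL games played)
sample = [
    (8, 0.85, 0.72), (12, 0.35, 0.10), (20, 0.70, 0.05), (22, 0.40, 0.55),
    (31, 0.65, 0.48), (44, 0.30, 0.00), (47, 0.25, 0.02),
    (62, 0.90, 0.61), (77, 0.20, 0.00), (93, 0.75, 0.00),
]

for key, rate in sorted(success_rates(sample).items()):
    print(key, f"{rate:.0%}")
```

Each rate can then be set beside the impact player percentage for its batch (39%, 15% and 7%) to see how much cushion a group has.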

So, after analyzing the data in yet another way, I'm left with the same conclusion: non-scoring CHL defensemen carry a disproportionate amount of risk and fail to become NHL players more often than not. I will say that I have softened my stance on Samuel Morin and Nikita Zadorov based on the information I looked at today; however, I still believe that they're both unlikely to become NHL regulars and shouldn't be drafted if guys like Pulock, Morrissey or Theodore are still on the board. Even "risky" Jordan Subban is as likely to become an NHL regular as "safe" Zadorov is. Also, guys like USNT product Steve Santini should probably be bypassed entirely in the early rounds of the draft because their offence just isn't there. I hope I've done enough to demonstrate that defensive ability in the CHL is overvalued and that offense and puck skills have historically carried far less risk among draft-eligible defensemen.

Thursday, 27 June 2013

Defending The Defenseman Post

James Mirtle and I had a good Twitter discussion about the defenseman post, and I think his criticisms raised some really good issues and points that I probably need to clarify. I'll go through a couple of them in depth because they're questions I can see being raised by others too.

1) "What about player X? He doesn't fit this pattern."
It's true that Chara wasn't a big junior scorer, but then again, how many 6'9'' players have ever played in the NHL? Bruce Arthur also makes the point that even among big men, Chara is somewhat of an anomaly with his immense on-ice ability. But even the data presented in my last post shows that guys who aren't big scorers in junior can go on to NHL success. To this point, I'd simply say that there are exceptions to every rule, and broad-brush analysis isn't going to be applicable to every individual case.

With this being said, there are still far too many examples of guys who were drafted for size or defensive ability that never turned into NHL regulars because they couldn't score. Branislav Mezei, Boris Valabik, Matthew Spiller, Ryan Parent, Matt Pelech, Libor Ustrnul, Andy Rogers and Logan Stephenson are all players that were taken early in the draft because of their size or defensive ability, only to find themselves unable to make it as NHL regulars. The list goes on and on and on, too.

The point that I think should have been made more clearly in the last post is similar to the one Tyler Dellow made here: defensive defensemen at the NHL level far more often than not exhibit puck skill at the CHL level. Guys who are known as prototypical shutdown NHL D-men, including Barret Jackman, Rostislav Klesla, Nick Schultz, Brent Seabrook, Dan Hamhuis, Tim Gleason and Karl Alzner, all were significant offensive contributors to their CHL teams in their draft years.

Of course, this begs the obvious question:
Based on the window of drafts I studied (1999 to 2008), and my cutoff point for an NHL regular (played in 40% or more of available NHL games), these are the guys who scored less than 0.5 Pts/GP in their draft year and have still become NHLers:

At first glance, that's an impressive list of names. However, you'll have to remember that this list is only 12 guys out of a sample of 105 that put up similar offensive numbers in their draft season, so these guys are anomalies. The other important fact to consider is that, with the exception of Fistric (probably the least talented player of this sample) and Schenn (who was in the NHL), every single player on this list became a significant offensive contributor their very next CHL season, as they all eclipsed 0.5 Pts/GP. Once again, guys who turn into NHL players, even guys we regard as pylons like Jeff Schultz, tend to be scorers in junior hockey.

2) "Offensive numbers don't do a good job of quantifying defense."

I agree with James' point here. At the same time, I'm not evaluating defense on the basis of offense. As I touched on in the section above, defensive value of CHL defensemen doesn't seem to translate to the NHL if the player in question isn't a two-way force in his junior career. Almost all of the players that became good NHL defensive defensemen drafted in the first three rounds of the draft between 1999 and 2008 had a high scoring rate in junior, and the vast majority of early round busts in that same time frame were guys who simply didn't put up points in the CHL, as was shown by the data presented in the previous post.

This means that of the draft-eligible defensemen from the CHL, some are riskier picks than others. Here's the table of the CHL guys ranked in the consensus top-90 again:

History has shown us that guys like Zadorov and Morin are risky picks. If I were an NHL GM, I wouldn't draft either of these players unless I was convinced that they did possess enough offensive talent to become a big scorer in the CHL in their draft +1 and +2 seasons. James indicates that some scouts believe this is true of Zadorov, and I hinted in my last post that I think Madison Bowey could be a similar case from my limited viewings of the Kelowna Rockets.

I didn't think that "points are not the be-all-end-all" was something that I needed to explicitly state, and obviously there are a ton of other factors that go into predicting whether a draft pick will be successful or not. However, as a broad brush analysis, CHL point totals still paint a pretty clear picture that guys with no offensive ability in major junior rarely turn into NHL players.

I hope this clarifies some of the statements I made previously, but I still think the rule of thumb conclusion that I came to in the last post is valid: defense at the junior level - be it size, physical play, shot blocking or something else - is overvalued, resulting in a disproportionate number of early round busts. As this is the case, in my mind at least, Nikita Zadorov probably shouldn't be more highly regarded than Ryan Pulock or Josh Morrissey, or maybe not even more highly than Shea Theodore or Jordan Subban just based on the historical examples of what successful NHL defensemen tend to do when they're in junior.

Wednesday, 26 June 2013

Defense, Defensemen, and the Draft

Pittsburgh's Scott Harrington. Future draft bust?

One of the things that drives me off-the-wall crazy about Hockey Canada at the junior level is the fetishization of stuff as nebulous as "heart" and "grit" and "toughness." Consequently, we get guys on our international junior teams who, when they appear to exhibit some of these intangible qualities, are lauded for their on-ice defensive abilities. Take, for example, Scott Harrington. A Penguins 2nd round pick in 2011, he was named the captain of the OHL champion London Knights this past season (leadership!), was a finalist for OHL defenseman of the year (defense!), and was guaranteed a spot on Canada's World Junior Championship team's blueline because he was there before and because he blocked shots (heart!). Corey Pronman lists him as one of Pittsburgh's top-10 prospects, saying that his upside is a 3rd or 4th NHL defenseman due to being a "high-end thinker" with stellar defensive ability.

And yet he'll more than likely be out of NHL hockey by the time he's 25, doomed to a career bouncing around the minor leagues and Europe, mostly because he's not a very good hockey player, relatively speaking. It's a good thing I'm about to spend somewhere in the neighborhood of 1500 words backing up this claim, or else you might think I'm crazy.


Since the draft is just around the corner, I decided to take a look at all the CHL defensemen drafted in the first three rounds between 1999 and 2008 to see if there was any link between offensive production at the junior level and NHL success. Tyler Dellow has danced around this topic in the past, pointing out that depth roster players are mostly high draft picks that don't achieve star offensive status in the NHL, and that most defensive defensemen of consequence establish themselves as NHL regulars by their early twenties, but I've yet to see someone go through each pick and look at which ones were successful and which weren't, even though I'm sure such a study is out there.

My hypothesis was that to be a regular NHL defenseman, you probably had to be an outstanding player in the CHL at both ends of the ice. Consequently, guys who were drafted for their "defensive abilities" but couldn't score would make up the vast majority of early-round draft busts, at least when it came to defenders. I decided to use a 10-year window of drafts, with the most recent one I looked at being 2008. This gives the most recent batch of players I looked at five years to get acclimated to the NHL and start to stake a claim to a roster spot that they'll ideally hold down for years to come. I also stuck to the CHL for a couple of reasons: 1) it's the biggest feeder league of players to the NHL, and 2) it's probably the most familiar major junior league to me as well as any other Canadians (and certain Americans too) in terms of styles of play and levels of competition.

I evaluated players on the percent of NHL games played, relative to the total games available to play over that time. For example, since Dan Hamhuis was drafted in 2001, he was eligible to play in a total of 868 NHL games. He has played in 676 games over this span, meaning that he has appeared in 77.9% of NHL games. For the purposes of this study, I defined an "NHL regular" as someone who has appeared in 40% or more of the games they were eligible to play in. I then compared this percentage to their draft-year points-per-game and made a scatterplot. It's pretty rudimentary stuff, but the results are still pretty telling:

Note the massive cluster of points on the bottom left hand line of the plot, and how it magically disperses once it reaches about 0.55 Pts/GP. This tells me that while scoring in junior doesn't guarantee NHL success, not scoring in junior more often than not predicts NHL failure. To illustrate this point better, I divided the plot into quadrants (divisions at 40% NHL GP and 0.6 draft year Pts/GP) and made note of how many players fall into each category:

Based on historical data, a CHL defenseman taken early in the draft with fewer than 0.6 Pts/GP in his draft year, like Scott Harrington or Dylan McIlrath or Colten Teubert, only has about a 1 in 10 chance of even making the NHL as a full-time player. Going back to Harrington, only 3 players in the last 15 years have scored at a lower rate in their draft years and established themselves as NHL regulars: Mark Fistric, Tyler Myers, and Shea Weber. However, Fistric was never a big scorer and finds himself dangerously close to falling out of "NHL regular" status, while Weber and Myers grew into elite 19-year-old scorers in their draft +2 seasons. Weber had 0.75 Pts/GP with Kelowna, and Myers put up an impressive 48 points in the NHL. Harrington still finds himself under 0.40 Pts/GP in his draft +2 season, which means he's tracking to be just like the other 91 guys who haven't ever made the show full-time.
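The games-played measure and the quadrant split described above are simple enough to sketch in a few lines. The Hamhuis figures and the two cutoffs come straight from this post; the function names and the four example players are hypothetical, invented only to exercise each quadrant.

```python
REGULAR_CUTOFF = 0.40   # share of available NHL games that defines an "NHL regular"
SCORING_CUTOFF = 0.6    # draft-year Pts/GP dividing line

def nhl_gp_share(games_played, games_available):
    """Share of the NHL games a player was eligible for that he actually played."""
    return games_played / games_available

def quadrant(pts_per_gp, gp_share):
    """Place a player in a quadrant (x axis: Pts/GP, y axis: NHL % of GP)."""
    vertical = "top" if gp_share >= REGULAR_CUTOFF else "bottom"
    horizontal = "right" if pts_per_gp >= SCORING_CUTOFF else "left"
    return f"{vertical} {horizontal}"

# Dan Hamhuis: 676 games played out of 868 available (figures from the text)
hamhuis_share = nhl_gp_share(676, 868)
print(f"Hamhuis: {hamhuis_share:.1%} of available games")

# Hypothetical players: (draft-year Pts/GP, share of NHL games played)
examples = {
    "scorer who stuck": (0.90, 0.70),
    "scorer who missed": (0.80, 0.05),
    "non-scorer who stuck": (0.45, 0.65),
    "non-scorer who missed": (0.30, 0.00),
}

for label, (pts_per_gp, gp_share) in examples.items():
    print(f"{label}: {quadrant(pts_per_gp, gp_share)}")
```

Counting how many real players land in each of the four labels is all the quadrant breakdown amounts to.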

So what does this mean for the draft on Sunday? Well, according to the consensus top-100 players, there are 15 CHL defensemen ranked in the first 3 rounds. They are as follows:

Just based on the stuff that was outlined above, you can say with a fair degree of certainty that Zadorov, Morin, Heatherington, Diaby and Kanzig all will not be long-term impact NHL players (coincidentally, all of these guys are 6'5 or taller, with the exception of 6'3 Dillon Heatherington) unless someone gets really, really lucky. It just goes to show the love affair that scouts have with nice bodies, as Dan Dorazio will tell you:

Other players like Mueller and Bowey should be regarded as very risky picks, too. They are probably much more skilled than the five big guys listed above, and I'd be anxious to see how they do in their draft +1 and draft +2 years. I'm no scout, but Bowey's meager 30 points is really surprising to me considering how great of a skater he is and how well I've seen him play with the puck.

Also, Corey Pronman calls Jordan Subban "risky" because of his small frame and defensive question marks, when he's probably a far, far less risky pick than either Zadorov or Morin. A coach can tell guys where to stand in the defensive zone, but a coach can't tell a guy to be talented. As the numbers have shown, the youngest Subban probably has a 50/50 shot at the NHL, whereas the odds are stacked against the 9th ranked Zadorov about 10/90.

The overriding lesson here, however, is don't draft a defensive defenseman early since it rarely, if ever, works in your favour. The few that do work out almost always put up big offensive numbers in junior before turning pro anyways, as I'll point out just below. Of course, this philosophy probably extends out beyond the draft eligible CHL crop of talent, too. If I was an NHL GM, guys like 37th ranked Steve Santini out of the USNT (0G, 13A in 62GP) would be completely out of the question as well. I'm no expert, but the numbers seem to agree that guys who actively contribute to winning hockey games by scoring eventually turn out to be good at hockey. Really, this should all just be common sense.


As an aside, I always find it fascinating to look at the guys who are the anomalies. Here, they are the guys in the top left and lower right quadrants. How did the low-scoring guys make it? How did the high-scoring guys fail? Here are some of the explanations for why the outliers are outliers:

How they made it:
- Kris Letang (62nd overall, 2005) and Marc-Edouard Vlasic (35th overall, 2005) both had less than 0.5 Pts/GP in their draft years, but surged to over a point per game the very next season. Despite falling in the "unlikely to make it" category, they still both had a history of putting up big offensive numbers in junior.
- Marc Staal (12th overall, 2005), Dion Phaneuf (9th overall, 2003) and Travis Hamonic (53rd overall, 2008) all also had less than 0.5 Pts/GP in their draft years, but all reached above 0.6 Pts/GP by the time their draft +2 seasons had ended. Phaneuf and Hamonic were better than a point per game in their final years of junior.
- Braydon Coburn (8th overall, 2003) definitely had a down year in his draft season. He only scored 19 points in 53 games in 02-03, but had more than 0.5 Pts/GP every other year of his WHL career.

How they missed:
- Andrew Campbell (74th overall, 2008) was drafted as an overage '85 birthday in the '87 draft class after scoring 35 points in 68 games. In his first draft eligible season, he had 4 points. Danny Syvret (81st overall, 2005) is a similar case.
- Jesse Lane (91st overall, 2002) had a very promising 0.84 Pts/GP in his draft year, but decided to quit hockey to pursue his studies once Carolina declined to offer him an entry level contract after a strong junior career.
- Ivan Vishnevskiy (27th overall, 2006) is the only player in the bottom right group to leave for the KHL. He most recently played in North America for Chicago's AHL affiliate.
- Some players were drafted high after an obvious outlier season. Examples include Alex Plante (15th overall, 2007), Dustin Kohn (46th overall, 2005), Martin Vagner (26th overall, 2002) and Josh Godfrey (34th overall, 2007).
- Other players in this bottom right group may just have yet to establish themselves as full-time NHL players. Guys who are close to full-time duty include Thomas Hickey (4th overall, 2007) and Bob Sanguinetti (21st overall, 2006).

Tuesday, 4 June 2013

Why Are We So Damn Ignorant?

For some reason, rather than continually explore new avenues and attempt to enhance our understanding of how hockey actually works, we (I'm using "we" fairly broadly here) seem to instead bury our heads in the sand, dig our heels in, and resist new ideas and concepts. This is especially prevalent if these new ideas are in the form of acronyms or contain decimals. Tonight, my absolute favourite TV hockey personality Steve Kouleas left Twitter with a gem of an anti-thinking tweet, and wouldn't you know it, the analyblogotwittosphere or whatever we're calling it jumped all over him. I'm not going to rip Kouleas here because I find him insufferable and it'll turn into a rant, but I'll address some stuff that really has been bothering me, mainly just how to meld stats with more "traditional" analysis done by the talking heads in a way that makes the staple intermission or pregame or postgame panel much more informative or insightful.

But first, a rundown of what happened, just to prove I'm not setting up a strawman here. I don't know what prompted it, but ol' Stevie just felt it necessary to throw this out there:

That's not even grammar'd correctly, and the excessive use of ellipses basically undermines any point he may have been trying to make since they show he has the punctuation skills of a second grader, but that's irrelevant. The main point is that a major host on perhaps the world's most influential television network when it comes to meaningful hockey analysis is completely dismissing the underlying school of thinking that seems to be slowly welling up beneath the traditional mainstream narratives. You would think that, as a network dedicated to feeding hockey programming en masse to the insatiable Canadian populace, TSN would also strive to deliver the most insightful programming too. For the most part, I think they're the best of the three major Canadian networks, with guys like McKenzie, Dreger, LeBrun, and even Ferraro and Ward at their disposal.

TH2N seems wildly divergent from this perception however. It's like something CBC's panel would do sans Friedman. "LOOK HOCKEY TALK LOUD NOW," or something like that. Aside from the banal shouting that goes on between Kouleas and Button (whom I think is horribly miscast), there's also the fact that berating alternative viewpoints seems to be the norm from TH2N regulars, as shown by STATS GUY tonight and Steve Simmons a while ago:

Other than the obvious "I claim to understand your ideas completely but really I have no idea" stuff that's going on here, I don't think that taking the stance of "it will never work!" is either productive or even smart. As Broad Street Hockey's Eric Tulsky pointed out earlier tonight, a lot of smart people have spent a lot of time looking at this stuff. Dismissing it outright works on the same logic as denying climate change or suggesting that smoking doesn't cause cancer. Unless you can defend your argument that Corsi is "ridiculous nonsense" with actual strong evidence, your opinion is invalid, no matter how large the pulpit you preach from (and as we all know, a one-game sample that lacks any sort of context is flimsy evidence at best).


This all brings me to my main gripe with how hockey is being analyzed right now: incorporating stats, educating your viewership, and making your product better would be incredibly easy and I don't understand why it's not being done. Take Hockey Night in Canada for example. On one hand, you have Elliotte Friedman getting free rein to use in an intermission bit, and then on the other Glenn Healy claims that Pavel Datsyuk isn't that talented. Fortunately, you only need one guy to know what he's talking about to present a segment in a way that makes everyone come across as intelligent. Imagine an intermission segment on HNiC as follows:

MacLean: "The Boston Bruins are up 2-0 on the Penguins. Who could have seen this coming, Elliotte?"

Friedman: "Well Ron, according to some of the most reliable measures we have on predicting future success, the Bruins were by far and away the better team than Pittsburgh through the regular season. We all know that Detroit built a decade of success on the back of puck possession, and the best measures of puck possession we have at our disposal are a couple of stats called Corsi and Fenwick. *brief explanation of both stats followed by a chart showing the massive difference between BOS and PIT in these measures.* So while this fast 2-0 lead for the Bruins may be surprising to some, it really shouldn't be."

MacLean: "So PJ, what are the Bruins doing in these playoffs to keep up this great puck possession play?"

Stock: "It's all about the 50-50 puck battles. Here you got a guy like Bergeron who's a great player and look at him work against Crosby here *highlight reel of Bergeron and other Bruins fighting for pucks along the boards.* You win with guys like Bergeron. They turn pucks over and give it to your team and help you keep the puck for more of the game."

MacLean: "But coaching has an important role too, Kevin?"

Weekes: "Absolutely. You look at how Boston is structured in their defensive end, and they keep pucks to the outside and cause turnovers. Pittsburgh on the other hand is really loose. Look at how poor their coverage is on this Krejci goal! They're chasing the Bruins all over the ice, and never in a good position to get the puck back if they turn it over. As a result, Boston has the puck more and they're winning."

A segment like that introduces and explains a statistical concept, and then outlines the factors and events in a hockey game that go into that stat. It doesn't simply run through numbers; it marries the newer wave of analytics with the traditional narrative in a way that should be more easily understandable to viewers. Stats aren't intrinsically divorced from the old-school way of looking at hockey, but mainstream analysts really have yet to make the connection between what we see going on, and what we measure going on. It's not a hard connection to make, but I really hope it happens sooner rather than later. Hockey needs more Friedmans and fewer Kouleases.


One last thing: my favourite part of this whole thing was when Kouleas challenged Twitter to "go to a coach or GM and ask them a question on it" and both a coach and a GM immediately responded telling him that he was full of shit. I really hope the Soo Greyhounds win a Memorial Cup soon since it would be nice to see an organization that embraces analytics so openly get rewarded for it.