Introducing TWC’s Player Offensive Evaluation Tool

This player evaluation tool was developed solely using Google Sheets and aims to give an overall picture of a player’s offensive output. The model is fairly basic, and is not without its flaws, but can be used as a quick off-the-cuff evaluator or comparison tool. If you’d like to skip the explanation and go right to the tool, click here.

Before we continue, I’d like to give a huge shoutout to Evolving-Hockey (@evolvinghockey) for the inspiration for this tool. Their RAPM model is fascinating, and the chart output of my model mirrors theirs because of how clean their presentation is. Please check out their website and Twitter page; they’re definitely one of the best hockey analytics teams out there right now and create a lot of really cool things for the community.

How does this model work?

Unlike the EvolvingWild RAPM charts, which paint both offensive and defensive pictures, this TWC model focuses only on offense. That’s because offensive statistics are much more readily available, easier to understand, and simply easier to deal with.

Nine different statistics were used in this analysis: LDCF%, MDCF%, HDCF%, iLDCF/60, iMDCF/60, iHDCF/60, Goals/60, First Assists/60, and Second Assists/60. To explain how each stat was adjusted for use in the model, I’ll go through one stat, iLDCF/60, in detail for my current favourite player: Calgary Flames winger Matthew Tkachuk. Every statistic discussed below was taken at 5v5.

In this model, two separate data sets were created, one for forwards and one for defensemen. Each forward is only evaluated against all other forwards, and each defenseman is only evaluated against all other defensemen. This helps adjust the data set for the inherently lower offense created by defensemen, and at the same time doesn’t unfairly reward lower contributing forwards that might be much higher in the set due to the presence of defensemen. This does however make it difficult to accurately compare forwards against defensemen. When using this tool, it’s important to only compare forwards to other forwards, and defensemen to other defensemen.

Step 1: Find each player’s iLDCF/60 value

To reduce single season bias, data from the 2016-17, 2017-18, and 2018-19 seasons were gathered. Each player’s iLDCF/60 value was created using a weighted average over all three seasons, with each season worth twice as much as the previous one. Values from the 2016-17 season were weighted 1, 2017-18 season 2, and 2018-19 season 4. The total is divided by the sum of the weights: 1 + 2 + 4 = 7.

However, this only works for players with three years of data. For players who have only played in the previous two seasons, the 2017-18 and 2018-19 seasons were used, with 2018-19 being worth double 2017-18. For players with only the 2018-19 season, their results from that season were used on their own. If a player didn’t play in the 2018-19 season, they were omitted from the data set. This does create some error in the set, as players who had an incredible 2018-19 season look like the best players in the world. Though only players with at least 20 games on average over their previous seasons are shown on the charts, this is one area that can be improved. For an example of this, look up Carolina Hurricanes forward Warren Foegele in the tool. Is he really better than Connor McDavid? Probably not.

Regardless, this process was used to calculate a weighted average value for every player in the data set. The small-sample issue is one limitation of this tool in its current form (discussed later).

Tkachuk’s iLDCF/60 was calculated as follows:

| Year | iLDCF/60 | Weighting |
| --- | --- | --- |
| 2016-17 | 4.74 | 1 |
| 2017-18 | 5.38 | 2 |
| 2018-19 | 6.04 | 4 |
| TOTAL | 5.67 | 7 |
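The weighting scheme above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the actual Google Sheets formulas; the function name `weighted_rate` is made up for this example, and seasons are assumed to be listed oldest first.

```python
def weighted_rate(seasons):
    """Weighted average of per-season values, listed oldest first.

    Hypothetical helper: the most recent season gets weight 4, the one
    before it 2, and the one before that 1. Players with fewer seasons
    use only the most recent weights (2 and 4, or just 4).
    """
    if not 1 <= len(seasons) <= 3:
        raise ValueError("expected one to three seasons of data")
    weights = [1, 2, 4][-len(seasons):]
    total = sum(value * weight for value, weight in zip(seasons, weights))
    return total / sum(weights)


# Tkachuk's iLDCF/60 for 2016-17, 2017-18, and 2018-19
tkachuk_ildcf = weighted_rate([4.74, 5.38, 6.04])
print(round(tkachuk_ildcf, 2))  # 5.67
```

Slicing the weight list from the end means a two-season player automatically gets the 2 and 4 weights, which matches the "2018-19 is worth double 2017-18" rule.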

Step 2: Find normal distribution value

It doesn’t make sense to use raw data values in this situation. For a percentage-based statistic in hockey, the average is generally close to 50 and the majority of data points are no more than 20 points higher or lower than the average. Because of this limited range, the data must be adjusted before a single percentage point can be weighed fairly against a counting statistic like Goals or a rate statistic like iLDCF/60.

In the case of iLDCF/60, the league weighted average over the past three years is 5.86, and though the overall range is 0 to 17.57, 90% of all data points are between 2.45 and 9.73. Therefore, to adjust each statistic, the normal distribution value was calculated for each player: the cumulative probability, under a normal curve fitted to that statistic, of a value at or below the player’s own.

To carry out this calculation, the mean and standard deviation must be known. Fortunately, these are relatively easy calculations. For iLDCF/60, the mean is 5.86 and the standard deviation is 2.4.

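Tkachuk is compared against forwards only, and the table below uses a mean of 4.93299 and a standard deviation of 1.84656 rather than the 5.86 and 2.4 quoted above. With those values, his iLDCF/60 gives a normal distribution value of 0.65424: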

| iLDCF/60 | League Mean | Standard Deviation | Normal Distribution |
| --- | --- | --- | --- |
| 5.67 | 4.93299 | 1.84656 | 0.65424 |
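For anyone who wants to reproduce this step outside of a spreadsheet, here is a minimal Python sketch of the cumulative normal calculation. I’m assuming it behaves like a spreadsheet’s cumulative NORMDIST; the function name `normal_cdf` is made up for this example, and the unrounded weighted average (39.66 / 7) is used instead of the rounded 5.67 shown in the table.

```python
import math


def normal_cdf(x, mean, std_dev):
    """Cumulative probability of a normal distribution at x.

    Hypothetical helper: the share of a normal curve with the given mean
    and standard deviation that falls at or below x.
    """
    z = (x - mean) / std_dev
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Tkachuk's weighted iLDCF/60 before rounding: (4.74 + 2*5.38 + 4*6.04) / 7
tkachuk_ildcf = 39.66 / 7
value = normal_cdf(tkachuk_ildcf, mean=4.93299, std_dev=1.84656)
print(round(value, 3))  # 0.654
```

A player exactly at the mean lands at 0.5, and values further above the mean approach 1.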

Step 3: Adjust the normal distribution

Unfortunately, a further adjustment is required. Certain statistics have a large number of data points at the very bottom or very top of the set. For example, out of 1191 players (in the overall data set, not just the forwards data set we’ve been using for Tkachuk thus far), 403 have 0 Goals/60, which gives each of them a normal distribution value of 0.25242. This drastically skews the data upwards, as it looks like all 403 of these players have a higher Goals/60 output than 25.2% of their peers. For the percentage-based statistics, the range of normal distribution values was generally 0 to 1, because there were players who recorded 0% in each statistic.

To eliminate this issue, the normal distribution is adjusted based on the minimum and range of each individual statistic. The adjustment takes each player’s normal distribution, subtracts the minimum normal distribution for that statistic, and then divides by the range of normal distributions.

For iLDCF/60, Tkachuk’s normal distribution is 0.65424, the minimum normal distribution is 0.00378, and the maximum is 1.00000. Therefore, the adjusted normal distribution is:

(0.65424 – 0.00378)/(1.00000-0.00378) = 0.65293
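The same adjustment as a short Python sketch. The function name `adjust` is made up for this example, and the minimum and maximum are passed in directly instead of being looked up from the full data set.

```python
def adjust(value, minimum, maximum):
    """Rescale a normal distribution value onto a 0-to-1 range.

    Hypothetical helper: subtract the smallest value seen for the
    statistic, then divide by the range (maximum minus minimum).
    """
    return (value - minimum) / (maximum - minimum)


# Tkachuk's iLDCF/60: value 0.65424, minimum 0.00378, maximum 1.00000
print(round(adjust(0.65424, 0.00378, 1.00000), 5))  # 0.65293
```

After this step the lowest player in each statistic sits at exactly 0 and the highest at exactly 1, so a pile of players at zero no longer gets credit for being "better than" a quarter of the league.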

Possession

So that’s how values are calculated. Now we look at which metrics are included in the model.

The first statistical category shown in the charts is possession. Possession is an amalgamation of LDCF%, MDCF%, and HDCF%. The adjusted normal distributions for each are determined based on the method described above, are then multiplied by an assigned weighting, and finally summed together.

The weightings for these statistics are based on the league-wide shooting percentage for each danger range. For example, for low danger, the shooting percentage was calculated by taking the total goals scored from low danger, and dividing by the total number of shots taken from low danger. The percentages are as follows:

| Statistic | League-wide Shooting Percentage |
| --- | --- |
| LDCF | 7.812 |
| MDCF | 19.632 |
| HDCF | 54.424 |

Tkachuk’s overall possession score breaks down like this:

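| Statistic | 2016-17 | 2017-18 | 2018-19 | Weighted Average | Normal Distribution | Adjusted Normal Distribution | Weighting | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LDCF% | 56.48 | 57.66 | 57.79 | 57.57 | 0.92450 | 0.92450 | 7.812 | 7.22221 |
| MDCF% | 56.06 | 56.92 | 58.23 | 57.55 | 0.87302 | 0.87303 | 19.632 | 17.13939 |
| HDCF% | 54.17 | 54.72 | 53.54 | 53.97 | 0.74187 | 0.74187 | 54.424 | 40.37574 |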

Summing the total for all three possession stats gives the total possession score for Tkachuk: 64.73733.
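Here is a rough Python sketch of that weighted sum, using the adjusted values and weightings from the table. The dictionary keys and the function name `category_score` are made up for this example; the result differs from the table in the last decimals only because the inputs are rounded to five places.

```python
# Weightings for the low, medium, and high danger stats (from the table above)
WEIGHTS = {"LD": 7.812, "MD": 19.632, "HD": 54.424}


def category_score(adjusted, weights):
    """Multiply each adjusted value by its weighting and sum the results."""
    return sum(adjusted[key] * weights[key] for key in weights)


# Tkachuk's adjusted normal distributions for LDCF%, MDCF%, and HDCF%
tkachuk_possession = {"LD": 0.92450, "MD": 0.87303, "HD": 0.74187}
score = category_score(tkachuk_possession, WEIGHTS)
print(round(score, 2))  # 64.74
```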

This process is carried out for each player, and the standard deviation of the resulting set of possession scores is calculated: 13.58181. The score for the “average player” is calculated as well. The “average player” is defined as a player with adjusted normal distributions of 50% in all categories, i.e. a player at the median of the data set. The chart output compares the selected player to the “average player”. The average player’s possession score is 40.934.

The average player’s score is subtracted from the specified player’s possession score, and the result is divided by the standard deviation. This shows how many standard deviations the player’s score is above or below the average player. An exactly average player will show a bar with a zero value, indicating they are neither above nor below the average.

Tkachuk’s chart output is therefore:

(64.73733 – 40.934) / 13.58181 = 1.753
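The chart value can be sketched in Python as well. One thing worth noticing is that the average player’s score is simply half the sum of the weightings, since that player has 0.5 in every category. The function name `chart_value` and its parameters are made up for this example.

```python
def chart_value(score, weights, std_dev, baseline=0.5):
    """Standard deviations above or below the average player.

    Hypothetical helper: the average player has an adjusted value of
    `baseline` (0.5) in every statistic, so their category score is
    baseline * sum(weights).
    """
    average_score = baseline * sum(weights)
    return (score - average_score) / std_dev


possession_weights = [7.812, 19.632, 54.424]
print(round(0.5 * sum(possession_weights), 3))  # 40.934
print(round(chart_value(64.73733, possession_weights, 13.58181), 3))  # 1.753
```

The same function should work for any category that follows this process, as long as the matching weightings and standard deviation are passed in.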

Individual Shot Generation

The second output on the chart is individual shot generation. This amalgamated statistic uses iLDCF/60, iMDCF/60, and iHDCF/60 and follows the exact same process used above to calculate possession. Because these stats are also derived from breaking down shots based on danger level, the same weightings apply here as they did above.

Tkachuk’s overall individual shot generation score breaks down like this:

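| Statistic | 2016-17 | 2017-18 | 2018-19 | Weighted Average | Normal Distribution | Adjusted Normal Distribution | Weighting | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| iLDCF/60 | 4.74 | 5.38 | 6.04 | 5.67 | 0.65424 | 0.65293 | 7.812 | 5.10071 |
| iMDCF/60 | 3.14 | 4.41 | 3.21 | 3.54 | 0.46365 | 0.45659 | 19.632 | 8.96386 |
| iHDCF/60 | 4.1 | 4.48 | 4.32 | 4.33 | 0.80956 | 0.80815 | 54.424 | 43.98284 |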

Summing the total for all three shot generation stats gives the total individual shot generation score for Tkachuk: 58.04741.

The average player has an individual shot generation score of: 40.934. The standard deviation is 15.80223. Tkachuk’s chart output is therefore:

(58.04741 – 40.934) / 15.80223 = 1.083

Scoring

The final output on the chart is scoring. This stat is an amalgamation of Goals/60, First Assists/60, and Second Assists/60. Once again, the same process as above was used. The main difference for scoring is the weightings used for the three statistics.

Whereas the previous two sections had their weightings come from the league-wide shooting percentages from the three danger zones, the weightings for scoring are more subjective. I chose to assign total goals and total assists the same weighting of 100. Within the assists category, I chose to assign an 85 weighting to first assists and 15 to second assists. This rewards first assists much more than second assists, while keeping the total points stats the same overall weight.

Tkachuk’s overall scoring score breaks down like this:

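| Statistic | 2016-17 | 2017-18 | 2018-19 | Weighted Average | Normal Distribution | Adjusted Normal Distribution | Weighting | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Goals/60 | 0.58 | 0.84 | 0.94 | 0.86 | 0.61601 | 0.50838 | 100 | 50.83782 |
| First Assists/60 | 1.02 | 0.45 | 0.83 | 0.75 | 0.78224 | 0.76354 | 85 | 64.90120 |
| Second Assists/60 | 0.45 | 0.39 | 0.5 | 0.46 | 0.73214 | 0.70976 | 15 | 10.64641 |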

Summing the total for all three scoring stats gives the total scoring score for Tkachuk: 126.38543.

The average player has a scoring score of: 100. The standard deviation is 32.21604. Tkachuk’s chart output is therefore:

(126.38543 – 100) / 32.21604 = 0.819

Limitations

As stated above, this is not a perfect tool and does have its flaws.

  1. Overvaluing or undervaluing players with limited sample sizes
    • Players like Foegele look like generational talents due to few accrued seasons with limited games played. This can also happen in the opposite direction, where good players look like terrible ones due to the same issues.
  2. Comparing forwards to forwards and defensemen to defensemen
    • This is less of a limitation than a purposeful adjustment, but it is difficult to paint an accurate picture when comparing forwards to defensemen
    • It’s probably not valuable to do this comparison anyway, but it’s still a limitation of the model
  3. Subjective weighting assignment for scoring metrics
    • Whereas the possession and shot generation weightings were derived from real league-wide data, the weightings for scoring were created subjectively
    • This does place a heavy emphasis on goals, and especially on first assists over second assists

There are definitely other limitations but again, this is a basic model that is best used as an initial evaluation tool rather than an all-knowing comprehensive one.

Some Fun Comparisons

2018-19 Norris Trophy – Mark Giordano vs. Brent Burns:

2018-19 Hart Trophy – Nikita Kucherov vs. Sidney Crosby:

Best Center? – Connor McDavid vs. Auston Matthews:

Biggest trade of the summer – Milan Lucic vs. James Neal:

The most average and the worst found – Derek Stepan and Chris Thorburn:

Check it out!

If you haven’t tried it already, check out the tool here. If you have any feedback, let us know @wincolumnblog or @karimkurji.
