#ISOLHAC: Breaking down the data visualisation process — A live-coding session with R

On Saturday May 9, 2020, I had the pleasure of running a data visualisation workshop as a part of #ISOLHAC (isol-hac.com), an online hockey analytics conference hosted by Alyssa Longmuir (@alyssastweeting).

Part of the conference was a Visualisation Power Hour, where Sean Tierney (@ChartingHockey) and I each shared our data visualisation process, using Tableau and R respectively.

So, without further ado, let’s go over the happenings of the Visualisation Power Hour.

The R Workshop

Get started by downloading the slides to see the topics covered in the workshop.

Workshop Takeaways

1. Know where to obtain hockey data

Most of my visualisations are made by using the following sources:

Natural Stat Trick, Evolving-Hockey, MoneyPuck, Hockey-Reference, and NHL.

At times, additional sources are required, such as using Wikipedia to find some additional details that aren’t readily available. Other databases with hockey stats will be useful as well.

Data cleanup is an important part of making a visualisation with R. While cleanup can be performed with R code, sometimes it’s a whole lot easier to do it manually, for example by editing the file in a spreadsheet. Sure, it’s a bit more work, but there’s nothing wrong with that! It’s always good to know your own limitations. If you have time to learn the process, then by all means you should. However, sometimes the quick and dirty method of manual work is all you need.
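
For anyone who does want to try the cleanup in code, here’s a minimal sketch of what it might look like. It uses base R only, so it runs without extra packages (the same steps could be written with the tidyverse). The column names and values are invented placeholders, not real stats or the layout of any particular site’s export.

```r
# Hypothetical export: column names and values are placeholders, not real stats.
raw_csv <- "Player,Team,GP,xGF%
 Goalie A ,AAA,10,51.2
Goalie B,BBB,12,-
Goalie B,BBB,12,-
"

# check.names = FALSE keeps the headers as written so we can rename them ourselves;
# strip.white = TRUE trims stray spaces around unquoted text fields.
raw <- read.csv(text = raw_csv, check.names = FALSE, strip.white = TRUE,
                stringsAsFactors = FALSE)

# Tidy the column names: lower-case them and replace the awkward "%" symbol.
names(raw) <- tolower(names(raw))
names(raw) <- gsub("%", "_pct", names(raw), fixed = TRUE)

# Treat "-" as missing, then convert the column from text to numbers.
raw$xgf_pct[raw$xgf_pct == "-"] <- NA
raw$xgf_pct <- as.numeric(raw$xgf_pct)

# Drop exact duplicate rows.
clean <- unique(raw)
print(clean)
```

If this feels like more effort than fixing three cells by hand, that’s exactly the trade-off described above.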

2. Getting started with R

I do all of my work using RStudio, and all of my visualisations are made with the tidyverse, specifically with “ggplot2”.

Starting a visualisation can be as easy as finding a source of inspiration and knowing what types of data work with different types of graphs. My go-to source for getting started with a chart is the R Graph Gallery. It’s a great resource for inspiration, but also for the exact code required for making a plot. It doesn’t get easier than that.
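
To give a feel for how little code a first chart needs, here’s a minimal ggplot2 sketch. It is not the workshop code. The data frame, its column names, and its values are invented placeholders that only show the shape ggplot2 expects: one row per observation, one column per variable.

```r
library(ggplot2)

# Placeholder data: invented values, purely to show the structure.
games <- data.frame(
  game   = rep(1:5, times = 2),
  goalie = rep(c("Goalie A", "Goalie B"), each = 5),
  value  = c(48, 52, 50, 55, 53, 51, 47, 49, 50, 54)
)

# Map columns to aesthetics, then add one layer per visual element.
p <- ggplot(games, aes(x = game, y = value, colour = goalie)) +
  geom_line() +
  geom_point() +
  labs(title = "Placeholder comparison", x = "Game", y = "Value", colour = NULL) +
  theme_minimal()

print(p)
```

Swapping geom_line() for another geom (geom_col(), geom_boxplot(), and so on) is usually the first thing to change when adapting an example from a gallery.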

A big part of coding is knowing what to search for when running into errors or when wanting to implement a change. A lot of the time, it’s flipping back and forth between R and Google to find one line of code at a time. There’s nothing wrong with that and it’ll often lead to great resources for expanded code.

Once a plot is completed, it can be exported as a PDF and tweaked using Adobe Illustrator for additional customisation.
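
One way to do the export is ggplot2’s ggsave(), which picks the output format from the file extension. The sketch below is an assumption about workflow rather than the exact export step from the workshop. It uses the built-in mtcars dataset as a stand-in plot and writes to a temporary folder, so swap in your own plot and path.

```r
library(ggplot2)

# Stand-in plot built from a dataset that ships with R.
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# The ".pdf" extension tells ggsave() to write a vector PDF.
out_file <- file.path(tempdir(), "chart.pdf")
ggsave(out_file, plot = p, width = 8, height = 5, units = "in")

out_file  # path of the exported file
```

Because a PDF is a vector format, the text and shapes stay editable when the file is opened in Illustrator.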

3. Combine the visualisation with Adobe Illustrator

This is an optional step, and most of the time, a visualisation can be fully completed just with R. However, it’s nice to make additional tweaks to the chart to make it unique or give it that extra oomph factor.

When the R output is a PDF, Illustrator makes it easy to change fonts, realign components, and add personal branding such as logos.

These steps are by no means necessary, and it’s up to the designer to make that decision for themselves. I’m of the mindset that additional visual considerations can really elevate the final product.

4. Final Considerations

Some things to consider during the data visualisation process:

  • Does the chart convey the point you’re trying to make?
  • Did you acknowledge your sources/inspirations?
  • Is the chart accessible, particularly for the visually impaired?
  • Is the data presentation truthful?

First, an effective data visualisation should allow the reader to come to a conclusion. You can guide them there, but it should be apparent nonetheless.

It’s good practice to include your data sources and to show where you got your inspiration or how you made your chart. Acknowledgements give credit to the original creator and to the data sources.

It’s good to consider the accessibility of a chart. Selecting colour palettes that are colour-blind friendly and including alt text with the image can make a data visualisation a lot more accessible for your audience.
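
Both ideas can be tried directly in R. The sketch below is one possible approach, not something shown in the workshop. It assumes R 4.0.0 or later (for the built-in Okabe-Ito palette) and a version of ggplot2 recent enough to support labs(alt = ...) and get_alt_text(). The data are placeholders.

```r
library(ggplot2)

# Okabe-Ito is a colour-blind-friendly palette that ships with base R (>= 4.0.0).
# unname() drops the colour names so ggplot2 assigns colours by position.
okabe_ito <- unname(palette.colors(palette = "Okabe-Ito"))

df <- data.frame(
  group = c("A", "B", "C"),
  value = c(3, 5, 4)  # placeholder values
)

p <- ggplot(df, aes(x = group, y = value, fill = group)) +
  geom_col() +
  scale_fill_manual(values = okabe_ito[2:4]) +  # skip the first entry (black)
  labs(
    title = "Placeholder chart",
    alt = "Bar chart of three placeholder groups; group B has the highest value."
  )

# Retrieve the alt text, for example to paste into a tweet or an HTML alt attribute.
get_alt_text(p)
```

The alt text is stored with the plot object, not drawn on the image, so it still has to be supplied wherever the chart is published.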

And lastly, it’s awfully easy to misrepresent data, especially with data visualisations. I didn’t mention this during the workshop, but a great resource for this is Alberto Cairo’s (@AlbertoCairo) “How Charts Lie: Getting Smarter about Visual Information”.

5. The Final Product

Over the span of the workshop, we went through a basic data visualisation process and made a quick chart at the end of it all: a comparison of expected goals for percentages between David Rittich and Cam Talbot over the 2019-20 season.

The R code for making this plot (without Illustrator tweaks) is here:

I’ll be the first to admit, the data I used wasn’t optimal, but it did show the audience how to start from nothing and arrive at a presentable chart. That was the main goal of the workshop. I wanted new R users to be able to follow along and create their own plots, and that was exactly what happened! Check out some fantastic examples below:

Conclusion

So there you have it! A quick and easy way to get started with R to make hockey visualisations. Now go and check out Sean’s Tableau workshop as well and dive deep into the world of hockey data visualisations! There are no excuses anymore!

Once again, thanks to Alyssa for setting up ISOLHAC. She ran the whole daytime event while pulling an all-nighter in Australia, so don’t forget to send her lots of praise and thanks. Check out the rest of ISOLHAC by visiting isol-hac.com.
