# Graphs – beauty and truth (with apologies to Keats)

## A good graph is elegant

I really like graphs. I like the way graphs turn numbers into pictures. A good graph is elegant. It uses a few well-placed lines to communicate what would take a paragraph of text. And like a good piece of literature or art, a good graph continues to give, beyond the first reading. I love looking at my YouTube and WordPress graphs. These graphs tell me stories. The WordPress analytics tell me that when I put up a new post, I get more hits, but that every day more than 1000 people read one of my posts. The YouTube analytics tell me stories about when people want to know about different aspects of statistics. It is currently the end of the North American school year, and the demand is for my video on Choosing which statistical test to use. Earlier in the year, the video about levels of measurement is the most popular. And not many people view videos about statistics on the 25th of December. I’m happy to report that the YouTube and WordPress graphs are good graphs.

Spreadsheets have made it possible for anyone and everyone to create graphs. I like that graphs are easier to make. Drawing graphs by hand is a laborious task, fraught with error. But sometimes my heart aches when I see a graph used badly. I suspect that this is when a graphic artist has taken control, and the search for beauty has overridden the need for truth.

Three graphs spurred me to write this post.

## Graph One: Bad-tasting Donut on house occupation

The first was on a website to find out about property values. I must have clicked onto something to find out about the property values in my area, and was taken to the qv website. And this is the graph that disturbed me.

Sure it is pretty – uses pretty colours and shading, and you can find out what it is saying by looking at the key – with the numbers beside it. But a pie or donut chart should not be used for data which has inherent order. The result here is that the segments are not in order. Or rather they are ordered from most frequent to least frequent, which is not intuitive. Ordinal data is best represented in a bar or column chart. To be honest, most data is best represented in a bar or column chart. My significant other suggested that bar charts aren’t as attractive as pie charts. Circles are prettier than rectangles. Circles are curvy and seem friendlier than straight lines and rectangles. So prettiness has triumphed over truth.

## Graph Two: Misleading pictogram (a tautology?)

It may be a little strong to call bad communication lack of truth. Let’s look at another example. In a way it is cheating to cite a pictogram in a post like this. Pictograms are the lowest form of graph and are so often incorrect that finding a bad one is easier than finding a good one. In the graph below of fatalities it is difficult to work out what one little person represents.

A quick glance, ignoring the numbers, suggests that the road toll in 2014 is just over half what it was in 2012. However, the truth, calculated from the numbers, is that the relative size is 80%. 2012 has 12 people icons, representing 280 fatalities. One icon is removed for 2013, representing a drop of 9 fatalities. 2011 has one icon fewer again, representing a drop of 2 fatalities. There is so much wrong in the reporting of road fatalities that I will stop here. Perhaps another day…
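The arithmetic behind that complaint can be checked in a few lines. This is only a sketch using the figures quoted above (12 icons for 280 fatalities, and one icon removed for drops of 9 and then 2). The function names are my own, and the "consistent scale" is an assumption about how a pictogram ought to work, not something stated on the graph.

```python
# Figures quoted in the post: 12 icons stand for 280 fatalities.
ICONS = 12
FATALITIES = 280


def fatalities_per_icon(fatalities, icons):
    """Scale implied if every icon represented the same number of deaths."""
    return fatalities / icons


def icons_for(fatalities, scale):
    """Number of icons a consistently scaled pictogram would use for a count."""
    return fatalities / scale


# Assumed consistent scale, derived from the 12-icon row only.
scale = fatalities_per_icon(FATALITIES, ICONS)
print(f"One icon would represent about {scale:.1f} fatalities")

# Each of the two drops quoted in the post removed one whole icon.
for drop in (9, 2):
    worth = icons_for(drop, scale)
    print(f"A drop of {drop} is worth {worth:.2f} of an icon, yet a whole icon was removed")

# A year at 80% of the 280 figure, shown at the same scale.
print(f"80% of the toll would need {icons_for(0.8 * FATALITIES, scale):.1f} icons")
```

At a consistent scale, neither drop comes close to justifying the removal of a full icon, which is why the picture exaggerates the change.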

## Graph Three: Mysterious display on Household income

And here is the other graph that perplexed me for some time. It came in the Saturday morning magazine from our newspaper, as part of an article about inequality in New Zealand. Anyone who reads my blog will be aware that my politics place me well left of centre, and I find inequality one of the great ills of the modern day. So I was keen to see what this graph would tell me. And the answer is…

I have no idea. Now, I have expertise in the promulgation of statistics, and this graph stumped me for some time. Take a good look now, before I carry on.

I did work out, in the end, what was going on in the graph, but it took far longer than it should have. This article is aimed at an educated but not particularly statistically literate audience, and I suspect there will be very few readers who spent long enough working out what was going on here. This graph is probably numerically correct. I had a quick flick back to the source of the data (who, by the way, are not to be blamed for the graph, as the data was presented in a table) and the graph seems to be an accurate depiction of the data. However, the graph is so confusing as to be worse than useless. Please post critiques in the comments. This graph commits several crimes. It is difficult to understand. It poses a question and then fails to help the reader find the answer. And it does not provide insights that an educated reader could not get from a table. In fact, I believe it has obscured the data.

Graphs are the main way that statistical analysts communicate with the outside world. Graphs like these ones do us no favours, even if they are not our fault. We need to do better, and make sure that all students learn about graphs.

## Teaching suggestion – a graph a day

Here is a suggestion for teachers at all levels. Have a “graph a day” display – maybe for a month? Students can contribute graphs from the news media. Each day discuss what the graph is saying, and critique the way the graph is communicating. I have a helpful structure for reading graphs in my post: There’s more to reading graphs than meets the eye.

Here is a summary of what I’ve said and what else I could say on the topic.

## Thoughts about Statistical Graphs

- The choice of graph depends on the purpose
- The text should state the purpose of the graph
- There is not a graph for everything you wish to communicate
- Sometimes a table communicates better than a graph
- Graphs are part of the analysis as well as part of the reporting. But some graphs are better left hidden.
- If it takes more than a few seconds to work out what a graph is communicating, it should either be dumped or have an explanation in the text
- Truth (or communication) is more important than beauty
- There is beauty in simplicity
- Be aware that many people are colour-blind, or cannot easily differentiate between different shades.

## Feedback from previous post on which graph to use

Late last year I posted four graphs of the same data and asked for people’s opinions. You can link back to the post here and see the responses: Which Graph to Use.

The interesting thing is not which graph was selected as the most popular, but rather that each graph had a considerable number of votes. My response is that it depends. It depends on the question you are answering or the message you are sending. But yes – I agree with the crowd that Graph A is the one that best communicates the various pieces of information. I think it would be improved by ordering the categories differently. It is not very pretty, but it communicates.

I recently posted a new video on YouTube about graphs. It is a quick once-over of important types of graphs, and can help to clarify what they are about. There are examples of good graphs in there.

I have written about graphs previously, and you can find those posts here on the Collected Works page.

I’m interested in your thoughts. And I’d love to see some beautiful and truthful graphs in the comments.

If you haven’t already seen it, I can recommend Alberto Cairo’s “The Functional Art”, which has some excellent ideas on how to represent data in graphical form. (The sequel “The Truthful Art” is probably good too but I haven’t read it yet.)