Graphs – beauty and truth (with apologies to Keats)

A good graph is elegant

I really like graphs. I like the way graphs turn numbers into pictures. A good graph is elegant. It uses a few well-placed lines to communicate what would take a paragraph of text. And like a good piece of literature or art, a good graph continues to give, beyond the first reading. I love looking at my YouTube and WordPress graphs. These graphs tell me stories. The WordPress analytics tell me that when I put up a new post, I get more hits, but that every day more than 1000 people read one of my posts. The YouTube analytics tell me stories about when people want to know about different aspects of statistics. It is currently the end of the North American school year, and the demand is for my video on Choosing which statistical test to use. Earlier in the year, the video about levels of measurement is the most popular. And not many people view videos about statistics on the 25th of December. I’m happy to report that the YouTube and WordPress graphs are good graphs.

Spreadsheets have made it possible for anyone and everyone to create graphs. I like that graphs are easier to make. Drawing graphs by hand is a laborious task and fraught with error. But sometimes my heart aches when I see a graph used badly. I suspect that this is when a graphic artist has taken control, and the search for beauty has overridden the need for truth.

Three graphs spurred me to write this post.

Graph One: Bad-tasting Donut on house occupation

The first was on a website to find out about property values. I must have clicked onto something to find out about the property values in my area, and was taken to the qv website. And this is the graph that disturbed me.

Graphs named after food are seldom a good idea

Sure it is pretty – uses pretty colours and shading, and you can find out what it is saying by looking at the key – with the numbers beside it. But a pie or donut chart should not be used for data which has inherent order. The result here is that the segments are not in order. Or rather they are ordered from most frequent to least frequent, which is not intuitive. Ordinal data is best represented in a bar or column chart. To be honest, most data is best represented in a bar or column chart. My significant other suggested that bar charts aren’t as attractive as pie charts. Circles are prettier than rectangles. Circles are curvy and seem friendlier than straight lines and rectangles. So prettiness has triumphed over truth.

Graph Two: Misleading pictogram (a tautology?)

It may be a little strong to call bad communication lack of truth. Let’s look at another example. In a way it is cheating to cite a pictogram in a post like this. Pictograms are the lowest form of graph and are so often incorrect that finding a bad one is easier than finding a good one. In the graph of fatalities below, it is difficult to work out what one little person represents.

What does one little person represent?

A quick glance, ignoring the numbers, suggests that the road toll in 2014 is just over half what it was in 2012. However, the truth, calculated from the numbers, is that the relative size is 80%. 2012 has 12 people icons, representing 280 fatalities. One icon is removed for 2013, representing a drop of 9 fatalities. 2011 has one icon fewer again, representing a drop of 2 fatalities. There is so much wrong in the reporting of road fatalities, that I will stop here. Perhaps another day…
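The mismatch between icons and numbers can be checked with a little arithmetic. Here is a minimal sketch in Python, using only the 2012 and 2013 figures quoted above and assuming the drop of 9 is measured from the 2012 total of 280. The variable and function names are my own, chosen for illustration.

```python
# Figures quoted in the post: 12 icons stand for 280 fatalities in 2012;
# 2013 has one icon fewer and (assumed relative to 2012) 9 fewer fatalities.
icons_2012, deaths_2012 = 12, 280
icons_2013, deaths_2013 = icons_2012 - 1, deaths_2012 - 9


def per_icon(deaths, icons):
    """Fatalities that one icon would have to represent."""
    return deaths / icons


def ratio(later, earlier):
    """Size of the later value relative to the earlier one."""
    return later / earlier


scale_2012 = per_icon(deaths_2012, icons_2012)
scale_2013 = per_icon(deaths_2013, icons_2013)
visual_ratio = ratio(icons_2013, icons_2012)   # what the eye sees
true_ratio = ratio(deaths_2013, deaths_2012)   # what the numbers say

print(f"2012: one icon = {scale_2012:.1f} fatalities")
print(f"2013: one icon = {scale_2013:.1f} fatalities")
print(f"Icons suggest 2013 is {visual_ratio:.0%} of 2012")
print(f"Numbers say 2013 is {true_ratio:.0%} of 2012")
```

If one icon stood for a fixed number of fatalities, the two per-icon figures would match. They do not, and the icon count overstates the drop.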

Graph Three: Mysterious display on Household income

And here is the other graph that perplexed me for some time. It came in the Saturday morning magazine from our newspaper, as part of an article about inequality in New Zealand. Anyone who reads my blog will be aware that my politics place me well left of centre, and I find inequality one of the great ills of the modern day. So I was keen to see what this graph would tell me. And the answer is…

See how long it takes for you to find where you appear on the graph. (Pretending you live in NZ)

I have no idea. Now, I have expertise in the promulgation of statistics, and this graph stumped me for some time. Take a good look now, before I carry on.

I did work out in the end, what was going on in the graph, but it took far longer than it should. This article is aimed at an educated but not particularly statistically literate audience, and I suspect there will be very few readers who spent long enough working out what was going on here. This graph is probably numerically correct. I had a quick flick back to the source of the data (who, by the way, are not to be blamed for the graph, as the data was presented in a table) and the graph seems to be an accurate depiction of the data. However, the graph is so confusing as to be worse than useless. Please post critiques in the comments. This graph commits several crimes. It is difficult to understand. It poses a question and then fails to help the reader find the answer. And it does not provide insights that an educated reader could not get from a table. In fact, I believe it has obscured the data.

Graphs are the main way that statistical analysts communicate with the outside world. Graphs like these ones do us no favours, even if they are not our fault. We need to do better, and make sure that all students learn about graphs.

Teaching suggestion – a graph a day

Here is a suggestion for teachers at all levels. Have a “graph a day” display – maybe for a month? Students can contribute graphs from the news media. Each day discuss what the graph is saying, and critique the way the graph is communicating. I have a helpful structure for reading graphs in my post: There’s more to reading graphs than meets the eye.

Here is a summary of what I’ve said and what else I could say on the topic.

Thoughts about Statistical Graphs

  • The choice of graph depends on the purpose
  • The text should state the purpose of the graph
  • There is not a graph for everything you wish to communicate
  • Sometimes a table communicates better than a graph
  • Graphs are part of the analysis as well as part of the reporting. But some graphs are better to stay hidden.
  • If it takes more than a few seconds to work out what a graph is communicating it should either be dumped or have an explanation in the text
  • Truth (or communication) is more important than beauty
  • There is beauty in simplicity
  • Be aware that many people are colour-blind, or cannot easily differentiate between different shades.

Feedback from previous post on which graph to use

Late last year I posted four graphs of the same data and asked for people’s opinions. You can link back to the post here and see the responses: Which Graph to Use.

The interesting thing is not which graph was selected as the most popular, but rather that each graph had a considerable number of votes. My response is that it depends.  It depends on the question you are answering or the message you are sending. But yes – I agree with the crowd that Graph A is the one that best communicates the various pieces of information. I think it would be improved by ordering the categories differently. It is not very pretty, but it communicates.

I recently posted a new video on YouTube about graphs. It is a quick once-over of important types of graphs, and can help to clarify what they are about. There are examples of good graphs in there.


I have written about graphs previously and you can find those posts here on the Collected Works page.

I’m interested in your thoughts. And I’d love to see some beautiful and truthful graphs in the comments.

Framework for statistical report-writing

I’ve been pondering what needs to happen for a student to be able to produce a good statistical report. This has been prompted by an informal survey I conducted among teachers of high school statistics in New Zealand. Because of the new curriculum and assessments, many maths teachers are feeling out of their depth, and wondering how to help their students. I asked teachers what they found most challenging in teaching statistics. By far the most common response was related to literacy or report-writing.

Here is a sample of teacher responses when asked what they find most challenging:

  • Teaching students how to write.
  • Helping students present their thoughts and ideas in a written report.
  • Writing the reports for assessment- making this interesting.
  • Helping students use the statistical language required in assessments.
  • Getting students to adequately analyse and write up a report.
  • Trying to think more like an English teacher than a Mathematics teacher

These comments tend to focus on the written aspect of the report, but I do wonder if the inability to write a coherent report is also an indicator of some other limitations.

The following diagram outlines the necessary skills and knowledge to complete a good statistical report. In addition the student needs the character traits of critical thinking, courage and persistence in order to take the report through to completion.

A framework for analysing what needs to happen in the production of a good statistical report.

Basic Literacy

Though not sufficient on their own, literacy skills are certainly necessary. It is rather obvious that being able to write is a prerequisite to writing a report. In particular we need to be able to write in formal language. One common problem is the tendency to omit verbs, thus leaving sentences incomplete.

Understand concepts

Students must understand correctly the statistical concepts underlying the report. For example, if they are not clear what the median, mean and quartiles express, it is difficult to write convincingly about them, or indeed to report them using correct language. When students are unable to write about a concept, it may indicate that their understanding is weak.

Be familiar with graphs and output

These days students do not need to draw their own graphs or calculate statistics by hand, but do need to know what graphs and analysis are appropriate for their particular data and research question. And they need to know how to read and interpret the graphs.

Know what to look for in graphs and output

This differs from the previous aspect in that it is a higher level of acquaintanceship with the medium. For example in a regression, students need to know to look for heteroscedasticity, or outliers with undue influence. In time series students know to look for unusual spikes that occur outside the regular pattern. In comparing boxplots students look at overlap. This familiarity can only come through practice.

Understand the importance of context

What is an important feature in one context, may not be so in a different context. This can be difficult for students and instructors who are at home with the purity of mathematics, in which the context can often be ignored or assumed away. Unless students understand the importance of context, often contained within the statistical enquiry process, they are unlikely to invest time in understanding the context and looking at the relationship between the model and the real world problem.

Understand the context

Sometimes the context is easily understood by students, related to their daily life or interests such as sport, music or movies. However there are times when students need to become more conversant with an unfamiliar context. This is entirely authentic to the life of a statistician, particularly a consulting statistician. We are often faced with unfamiliar contexts. Over the years I have become more knowledgeable about areas as diverse as hand injuries, scientific expeditions to Antarctica, bank branch performance, prostate cancer screening and chicken slaughtering methods. Even though we may work with an expert in the field of the investigation, we must develop a working knowledge of the field and the terminology ourselves.

Be familiar with terminology

Part of statistical literacy is to be able to use the language of statistics. There are words that have particular meaning in a statistical context, such as random, significant, error and population. It is not acceptable to use statistical terms incorrectly in a statistical report. Statistics is a peculiar mixture of hand-waving and precision, and we need to know when each is needed. There is also a fair degree of equivocation, and students should be familiar with expressions such as “it appears…”, “there is evidence that”, and “a possible implication might be…”

These other aspects lead into the three main ideas:

Know what to include and exclude

This is where checklists can come in handy for students to make sure they have all the relevant details, and that they do not include unnecessary details. My experience is that there is a tendency for students to write a narrative of how they analysed the data, step by painful step. (I call it “what I did in the holidays.”) Students can also gain from seeing good exemplars that provide the results, without unnecessary detail about the process.

Express correct ideas in appropriate written language

This is probably the most obvious requirement for a good report. This comes from basic literacy, knowing what to look for, familiarity with the terminology and understanding of the concepts.

Relate the findings to the context

Our report must answer the investigative question or research questions. Each of the statistical findings must be related to the context from which the data has been taken. This must be done with the right amount of caution, not with bold assertions about results that the data only hints at.

If these three are happening well, then a good written report is on its way!

Developing skills

So how do we make sure students have all the requisite skills and knowledge to create a good statistical report? To start with we can use the framework provided here to diagnose where there may be gaps in the students’ knowledge or skills. Students themselves can use this as a way to find out where their weaknesses may be.

Then students must read, talk and write, over and over. Read exemplars, talk about graphs and output and write complete sentences in the classroom. All data must be real, so that students get practice at drawing conclusions about real people and things.

This framework is a work in progress and I would be pleased to have suggestions for improvement.

A Statistics-centric curriculum

Calculus is the wrong summit of the pyramid.

“The mathematics curriculum that we have is based on a foundation of arithmetic and algebra. And everything we learn after that is building up towards one subject. And at top of that pyramid, it’s calculus. And I’m here to say that I think that that is the wrong summit of the pyramid … that the correct summit — that all of our students, every high school graduate should know — should be statistics: probability and statistics.”

TED talk by Arthur Benjamin in February 2009. Watch it – it’s only 3 minutes long.

He’s right, you know.

And New Zealand would be the place to start. In New Zealand, the subject of statistics is the second most popular subject in our final year of schooling, with a cohort of 12,606. By comparison, the cohort for  English is 16,445, and calculus has a final year cohort of 8392, similar in size to Biology (9038), Chemistry (8183) and Physics (7533).
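To see how the cohorts compare, here is a small sketch in Python that ranks the six subjects by the cohort sizes quoted above and expresses each as a multiple of the calculus cohort. It covers only the subjects mentioned in this post, so the ranking is among those six. The dictionary and variable names are my own.

```python
# Final-year cohort sizes as quoted in the post.
cohorts = {
    "English": 16445,
    "Statistics": 12606,
    "Biology": 9038,
    "Calculus": 8392,
    "Chemistry": 8183,
    "Physics": 7533,
}

# Sort from largest cohort to smallest.
ranked = sorted(cohorts.items(), key=lambda item: item[1], reverse=True)
calculus = cohorts["Calculus"]

for rank, (subject, size) in enumerate(ranked, start=1):
    print(f"{rank}. {subject:<10} {size:>6,}  ({size / calculus:.2f} x calculus)")
```

On these figures the statistics cohort is roughly one and a half times the size of the calculus cohort.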

Some might argue that statistics is already the summit of our curriculum pyramid, but I would see it more as an overly large branch that threatens to unbalance the mathematics tree. I suspect many maths teachers would see it more as a parasite that threatens to suck the life out of their beloved calculus tree. The pyramid needs some reconstruction if we are really to have a statistics-centric curriculum. (Or the tree needs pruning and reshaping – I think I have too many metaphors!)

Statistics-centric curriculum

So, to use a popular phrase, what would a statistics-centric curriculum look like? And what would be the advantages and disadvantages of such a curriculum? I will deal with implementation issues later.

To start with, the base of the pyramid would look little different from the calculus-pinnacled pyramid. In the early years of schooling the emphasis would be on number skills (arithmetic), measurement and other practical and concrete aspects. There would also be a small but increased emphasis on data collection and uncertainty. This is in fact present in the NZ curriculum. Algebra would be introduced, but as a part of the curriculum, rather than the central idea. There would be much more data collection, and probability-based experimentation. Uncertainty would be embraced, rather than ignored.

In the early years of high school, probability and statistics would take a more central place in the curriculum, so that students develop important skills ready for their pinnacle course in the final two years. They would know about the statistical enquiry cycle, how to plan and collect data and write questionnaires. They would perform their own experiments, preferably in tandem with other curriculum areas such as biology, food-tech or economics. They would understand randomness and modelling. They would be able to make critical comments about reports in the media. They would use computers to create graphs and perform analyses.

As they approach the summit, most students would focus on statistics, while those who were planning to pursue a career in engineering would also take calculus. In the final two years students would be ready to build their own probabilistic models to simulate real-world situations and solve problems. They would analyse real data and write coherent reports. They would truly understand the concept of inference, and why confidence intervals are needed, rather than calculating them by hand or deriving formulas.

There is always a trade-off. Here is my take on the skills developed in each of the curricula.

Calculus-centric curriculum                 | Statistics-centric curriculum
Logical thinking                            | Communication
Abstract thinking                           | Dealing with uncertainty and ambiguity
Problem-solving                             | Probabilistic models
Modelling (mainly deterministic)            | Argumentation, deduction
Proof, induction                            | Critical thinking
Plotting deterministic graphs from formulas | Reading and creating tables and graphs from data

I actually think you also learn many of the calc-centric skills in the stats-centric curriculum, but I wanted to look even-handed.

Implementation issues

Benjamin suggests, with charming optimism, that the new focus would be “easy to implement and inexpensive.”  I have been a very interested observer in the implementation of the new statistics curriculum in New Zealand. It has not happened easily, being inexpensive has been costly, and there has been fallout. Teachers from other countries (of which there are many in mathematics teaching in NZ) have expressed amazement at how much the NZ teachers accept with only murmurs of complaint. We are a nation with a “can do” attitude, who, by virtue of small population and a one-tier government, can be very flexible. So long as we refrain from following the follies of our big siblings, the UK, US and Australia, NZ has managed to have a world-class education system. And when a new curriculum is implemented, though there is unrest and stress, there is seldom outright rebellion.

In my business, I get the joy of visiting many schools and talking with teachers of mathematics and statistics. I am fascinated by the difference between schools, which is very much a function of the head of mathematics and principal. Some have embraced the changes in focus, and are proactively developing pathways to help all students and teachers to succeed. Others are struggling to accept that statistics has a place in the mathematics curriculum, and put the teachers of statistics into a ghetto where they are punished with excessive marking demands.

The problem is that the curriculum change has been done “on the cheap”. As well as being small and nimble, NZ is not exactly rich. The curriculum change needed more advisors, more release time for teachers to develop and more computer power. These all cost. And then you have the problem of “me too” from other subjects who have had what they feel are similar changes.

And this is not really embracing a full stats-centric curriculum. Primary school teachers need training in probability and statistics if we are really to implement Benjamin’s idea fully. The cost here is much greater as there are so many more primary school teachers. It may well take a generation of students to go through the curriculum and enter back as teachers with an improved understanding.

Computers make it possible

Without computers the only statistical analysis that was possible in the classroom was trivial. Statistics was reduced to mechanistic and boring hand calculation of light-weight statistics and time-filling graph construction. With computers, graphs and analysis can be performed at the click of a mouse, making graphs a tool, rather than an endpoint. With computing power available real data can be used, and real problems can be addressed. High level thinking is needed to make sense and judgements and to avoid wrong conclusions.

Conversely, the computer has made much of calculus superfluous. With programs that can bash their way happily through millions of iterations of a heuristic algorithm, the need for analytic methods is seriously reduced. When even simple apps on an iPad can solve an algebraic equation, and Excel can use “What if” to find solutions, the need for algebra is also questionable.

Efficient citizens

In H.G. Wells’ popular but misquoted words, efficient citizenry calls for the ability to make sense of data. As the science-fiction writer that he was, he foresaw the masses of data that would be collected and available to the great unwashed. The levelling nature of the web has made everyone a potential statistician.

According to the engaging new site from the ASA, “This is statistics”, statisticians make a difference, have fun, satisfy curiosity and make money. And these days they don’t all need to be good at calculus.

Let’s start redesigning our pyramid.

A helpful structure for analysing graphs

Mathematicians teaching English

“I became a maths teacher so I wouldn’t have to mark essays”
“I’m having trouble getting the students to write down their own ideas”
“When I give them templates I feel as if it’s spoon-feeding them”

These are comments I hear as I visit mathematics teachers who are teaching the new statistics curriculum in New Zealand. They have a point. It is difficult for a mathematics teacher to teach in a different style. But – it can also be rewarding and interesting, and you never get asked, “Where is this useful?”

The statistical enquiry cycle provides a structure for all statistical investigations and learning.

We start with a problem or question, and undergo an investigation, either using extant data, an experiment or observational study to answer the question. Writing skills are key in several stages of the cycle. We need to be able to write an investigative question (or hypotheses). We need to write down a plan, and sometimes an entire questionnaire. We need to write down what we find in the analysis and we need to write a conclusion to answer the original question. That’s a whole heap of writing!

And for teachers who may not be all that happy about writing themselves, and students who chose mathematical subjects to avoid writing, it can be a bridge too far.

In previous posts on teaching report writing I promote the use of templates, and give some teaching suggestions.

In this post I am concentrating on analysing graphs, using a handy acronym, OSEM. OSEM was developed by Jeremy Brocklehurst from Lincoln High School near Christchurch NZ. There are other acronyms that would work just as well, but we like this one, not the least for its link with kiwi culture. We think it is awesome (OSEM). You could Google “o for awesome”, to get the background. OSEM stands for Obvious, Specific, Evidence and Meaning. It is a process to follow, rather than a checklist.

I like the use of O for obvious. I think students can be scared to say what they think might be too obvious, and look for tricky things. By including “obvious” in the process, it allows them to write about the important, and usually obvious, features of a graph. I also like the emphasis on meaning. Unless the analysis of the data links back to the context and purpose of the investigation, it is merely a mathematical exercise.

Is this spoon-feeding? Far from it. We are giving students a structure that will help them to analyse any graph, including time series, scatter plots, and histograms, as well as boxplots and dotplots. It emphasises the use of quantitative information, linked with context. There is nothing revolutionary about it, but I think many statistics teachers may find it helpful as a way to break down and demystify the commenting process.

Class use of OSEM

In a class setting, OSEM is a helpful framework for students to work in groups. Students individually (perhaps on personal whiteboards) write down something obvious about the graph. Then they share answers in pairs, and decide which one to carry on with. In the pair they specify and give evidence for their “obvious” statement. Then the pairs form groups of four, and they come up with statements of meaning, that are then shared with the class as a whole.

Spoon feeding has its place

On a side-note – spoon-feeding is a really good way to make sure children get necessary nutrition until they learn to feed themselves. It is preferable to letting them starve before they get the chance to develop sufficient skills and co-ordination to get the food to their mouths independently.

Teach students to learn to fish

There is a common saying that goes roughly, “Give a person a fish and you feed him for a day. Teach a person to fish and you feed her for a lifetime.”

Statistics education is all about teaching people to fish. In a topic on questionnaire design, we choose as our application the consumption of sugar drinks, the latest health evil. We get the students to design questionnaires to find out drinking habits. Clearly we don’t want to focus too much on the sugar drink aspect, as this is the context rather than the point of the learning. What we do want to focus on is the process, so that in future, students can transfer their experience writing a questionnaire about sugar drinks to designing a questionnaire about another topic, such as chocolate, or shoe-buying habits.

Questionnaire design is part of the New Zealand school curriculum, and the process includes a desk-check and a pilot survey. When the students are assessed, they must show the process they have gone through in order to produce the final questionnaire. The process is at least as important as the resulting questionnaire itself.

Here is our latest video, teaching the process of questionnaire design.

Examples help learning

Another important learning tool is the use of examples. When I am writing computer code, I usually search on the web or in the manual for a similar piece of code, and work out how it works and adapt it. When I am trying to make a graphic of something, I look around at other graphics, and see what works for me and what does not. I use what I have learned in developing my own graphics. Similarly when we are teaching questionnaire design, we should have examples of good questionnaires, and not so good questionnaires, so that students can see what they are aiming for. This is especially true for statistical report-writing, where a good example can be very helpful for students to see what is required.

Learning how to learn

But I’d like to take it a step further. Perhaps as well as teaching how to design a questionnaire, or write a report, we should be teaching how to learn how to design a questionnaire. This is a transferable skill to many areas of statistics and probability as well as operations research, mathematics, life… This is teaching people to be “life-long learners”, a popular catchphrase.

We could start the topic by asking, “How would you learn how to design a questionnaire?” then see what the students come up with. If I were trying to learn how to design a questionnaire, I would look at what the process might entail. I would think about the whole statistical process, thinking about similarities and differences. I would think about things that could go wrong in a questionnaire. I would also spend some time on the web, and particularly YouTube, looking at lessons on how to design a questionnaire. I would ask questions. I would look at good questionnaires. I would then try out my process, perhaps on a smaller problem. I would evaluate my process by looking at the end-result. I would think about what worked and what didn’t, and what I would do next time.

This gives us three layers of learning. Our students are learning how to write a questionnaire about sugar drinks, and the output from that is a questionnaire. They are also learning the general process of designing a questionnaire, which can be transferred to other questionnaire contexts. Then at the next level up, they are learning how to learn a process, in this case the process of designing a questionnaire. This skill can be transferred to learning other skills or processes, such as writing a time series report, or setting up an experiment or critiquing a statistical report.

Levels of learning in the statistics classroom

I suspect that the top layer of learning how to learn is often neglected, but is a necessary skill for success at higher learning. We are keen as teachers to make sure that students have all the materials and experiences they need in order to learn processes and concepts. Maybe we need to think a bit more about giving students more opportunities to be consciously learning how to learn new processes and concepts.

We can liken it a little to learning history. When a class studies a certain period in history, there are important concepts and processes that they are also learning, as well as the specifics of that topic. In reality the topic is pretty much arbitrary, as it is the tool by which the students learn history skills, such as critical thinking, comparing, drawing parallels and summarising. In statistics the context, though hopefully interesting, is seldom important in itself. What matters is the concepts, skills and attitudes the student develops through the analysis. The higher level in history might be to learn how to learn about a new philosophical approach, whereas the higher level in statistics is learning how to learn a process.

The materials we provide at Statistics Learning Centre are mainly fishing lessons, with some examples of good and bad fish.  It would be great if we could also use them to develop students’ ability to learn new things, as well as to do statistics. Something to work towards!

Those who can, teach statistics

The phrase I despise more than any in popular use (and believe me there are many contenders) is “Those who can, do, and those who can’t, teach.” I like many of the sayings of George Bernard Shaw, but this one is dismissive, ignorant and born of jealousy. To me, the ability to teach something is a step higher than being able to do it. The PhD, the highest qualification in academia, is a doctorate. The word “doctor” comes from the Latin word for teacher.

Teaching is a noble profession, on which all other noble professions rest. Teachers are generally motivated by altruism, and often go well beyond the requirements of their job-description to help students. Teachers are derided for their lack of importance, and the easiness of their job. Yet at the same time teachers are expected to undo the ills of society. Everyone “knows” what teachers should do better. Teachers are judged on their output, as if they were the only factor in the mix. Yet how many people really believe their success or failure is due only to the efforts of their teacher?

For some people, teaching comes naturally. But even then, there is the need for pedagogical content knowledge. Teaching is not a generic skill that transfers seamlessly between disciplines. You must be a thinker to be a good teacher. It is not enough to perpetuate the methods you were taught with. Reflection is a necessary part of developing as a teacher. I wrote in an earlier post, “You’re teaching it wrong”, about the process of reflection. Teachers need to know their material, and keep up-to-date with ways of teaching it. They need to be aware of ways that students will have difficulties. Teachers, by sharing ideas and research, can be part of a communal endeavour to increase both content knowledge and pedagogical content knowledge.

There is a difference between being an explainer and being a teacher. Sal Khan, maker of the Khan Academy videos, is a very good explainer. Consequently many students who view the videos are happy that elements of maths and physics that they couldn’t do, have been explained in such a way that they can solve homework problems. This is great. Explaining is an important element in teaching. My own videos aim to explain in such a way that students make sense of difficult concepts, though some videos also illustrate procedure.

Teaching is much more than explaining. Teaching includes awakening a desire to learn and providing the experiences that will help a student to learn.  In these days of ever-expanding knowledge, a content-driven approach to learning and teaching will not serve our citizens well in the long run. Students need to be empowered to seek learning, to criticize, to integrate their knowledge with their life experiences. Learning should be a transformative experience. For this to take place, the teachers need to employ a variety of learner-focussed approaches, as well as explaining.

It cracks me up, the way sugary cereals are advertised as “part of a healthy breakfast”. It isn’t exactly lying, but the healthy breakfast would do pretty well without the sugar-filled cereal. Explanations really are part of a good learning experience, but need to be complemented by discussion, participation, practice and critique.  Explanations are like porridge – healthy, but not a complete breakfast on their own.

Why statistics is so hard to teach

“I’m taking statistics in college next year, and I can’t wait!” said nobody ever!

Not many people actually want to study statistics. Fortunately many people have no choice but to study statistics, as they need it. How much nicer it would be to think that people were studying your subject because they wanted to, rather than because it is necessary for psychology/medicine/biology etc.

In New Zealand, with the changed school curriculum that gives greater focus to statistics, there is a possibility that one day students will be excited to study stats. I am impressed at the way so many teachers have embraced the changed curriculum, despite limited resources, and late changes to assessment specifications. In a few years as teachers become more familiar with and start to specialise in statistics, the change will really take hold, and the rest of the world will watch in awe.

In the meantime, though, let us look at why statistics is difficult to teach.

  1. Students generally take statistics out of necessity.
  2. Statistics is a mixture of quantitative and communication skills.
  3. It is not clear which are right and wrong answers.
  4. Statistical terminology is both vague and specific.
  5. It is difficult to get good resources, using real data in meaningful contexts.
  6. One of the basic procedures, hypothesis testing, is counter-intuitive.
  7. Because the teaching of statistics is comparatively recent, there is little developed pedagogical content knowledge, though this is growing.
  8. Technology is forever advancing, requiring regular updating of materials and teaching approaches.

On the other hand, statistics is also a fantastic subject to teach.

  1. Statistics is immediately applicable to life.
  2. It links in with interesting and diverse contexts, including subjects students themselves take.
  3. Studying statistics enables class discussion and debate.
  4. Statistics is necessary and does good.
  5. The study of data and chance can change the way people see the world.
  6. Technological advances have put the power for real statistical analysis into the hands of students.
  7. Because the teaching of statistics is new, individuals can make a difference in the way statistics is viewed and taught.

I love to teach. These days many of my students are scattered over the world, watching my videos (for free) on YouTube. It warms my heart when they thank me for making something clear, that had been confusing. I realise that my efforts are small compared to what their teacher is doing, but it is great to be a part of it.

How to learn statistics (Part 2)

Some more help (preaching?) for students of statistics

Last week I outlined the first five principles to help people to learn and study statistics.

They focussed on how you need to practise in order to be good at statistics and you should not wait until you understand it completely before you start applying. I sometimes call this suspending disbelief. Next I talked about the importance of context in a statistical investigation, which is one of the ways that statistics is different from pure mathematics. And finally I stressed the importance of technology as a tool, not only for doing the analysis, but for exploring ideas and gaining understanding.

Here are the next five principles (plus 2):

6. Terminology is important and at times inconsistent

There are several issues with regard to statistical terminology, and I have written a post with ideas for teachers on how to teach terminology.

One issue with terminology is that some words that are used in the study of statistics have meanings in everyday life that are not the same. A clear example of this is the word, “significant”. In regular usage this can mean important or relevant, yet in statistics, it means that there is evidence that an effect that shows up in the sample also exists in the population.

Another issue is that statistics is a relatively young science and there are inconsistencies in terminology. We just have to live with that. Depending on the discipline in which the statistical analysis is applied or studied, different terms can mean the same thing, or very close to it.

A third language problem is that mixed in with the ambiguity of results, and judgment calls, there are some things that are definitely wrong. Teachers and examiners can be extremely picky. In this case I would suggest memorising the correct or accepted terminology for confidence intervals and hypothesis tests. For example I am very fussy about the explanation for the R-squared value in regression. Too often I hear that it says how much of the dependent variable is explained by the independent variable. There needs to be the word “variation” inserted in there to make it acceptable. I encourage my students to memorise a format for writing up such things. This does not substitute for understanding, but the language required is precise, so having a specific way to write it is fine.
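The point about the word “variation” can be made concrete with a small calculation. The sketch below is only an illustration: the data, the variable names and the function name are all made up for the purpose. It computes R-squared as one minus the ratio of leftover (residual) variation to total variation in the dependent variable, which is why the accepted sentence talks about the variation that is explained.

```python
from statistics import mean

def r_squared(x, y):
    """Proportion of the variation in y explained by a least-squares line on x."""
    x_bar, y_bar = mean(x), mean(y)
    # Least-squares slope and intercept
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Total variation in y, and the variation left over after fitting the line
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_resid = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - ss_resid / ss_total

# Hypothetical data: hours of study and test score for six students
hours = [1, 2, 3, 4, 5, 6]
score = [52, 55, 61, 60, 68, 71]

r2 = r_squared(hours, score)
print(f"{r2:.0%} of the variation in score is explained by hours of study")
```

The printed sentence follows the memorised format: it is the variation in the dependent variable that is explained, not the variable itself.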

This problem with terminology can be quite frustrating, but I think it helps to have it out in the open. Think of it as learning a new language, which is often the case in a new subject. Use glossaries to make sure you really do know what a term means.

7. Discussion is important

This is linked with the issue of language and vocabulary. One way to really learn something is to talk about it with someone else and even to try and teach it to someone else. Most teachers realise that the reason they know something pretty well, is because they have had to teach it. If your class does not include group work, set up your own study group. Talk about the principles as well as the analysis and context, and try to use the language of statistics. Working on assignments together is usually fine, so long as you write them up individually, or according to the assessment requirements.

8. Written communication skills are important

Mathematics has often been a subject of choice for students who are not fluent in English. They can perform well because there is little writing involved in a traditional mathematics course. Statistics is a different matter, though, as all students should be writing reports. This can be difficult at the start, but as students learn to follow a structure, it can be made more palatable. A statistics report is not a work of creative writing, and it is okay to use the same sentence structure more than once. Neither is a statistics report a narrative of what you did to get to the results. Generous use of headings makes a statistical report easier to read and to write. A long report is not better than a short report, if all the relevant details are there.

9. Statistics has an ethical and moral aspect

This principle is interesting, as many teachers of statistics come from a mathematical background, and so have not had exposure to the ethical aspects of research themselves. That is no excuse for students to park their ethics at the door of the classroom. I will be pushing for more consideration of ethical aspects of research as part of the curriculum in New Zealand. Students should not be doing experiments on human subjects that involve delicate topics such as abuse or bullying. Their experiments should not involve alcohol or other harmful substances. Students should be aware of the potential to do harm, and make sure that any participants have been given full information and have given consent. This can be quite a hurdle, but is part of being an ethical human being. It also helps students to be more aware when giving or withholding consent in medical and other studies.

10. The study of statistics can change the way you view the world

Sometimes when we learn something at school, it stays at school and has no impact on our everyday lives. This should not be the case with the study of statistics. As we learn about uncertainty and variation we start to see this in the world around us. When we learn about sampling and non-sampling errors, we become more critical of opinion polls and other research reported in the media. As we discover the power of statistical analysis and experimentation, we start to see the importance of evidence-based practice in medicine, social interventions and the like.

11. Statistics is an inherently interesting and relevant subject

And it can be so much fun. There is a real excitement in exploring data, and becoming a detective. If you aren’t having fun, you aren’t doing it right!

12. Resources from Statistics Learning Centre will help you learn.

Of course!

Statistics is not beautiful (sniff)

Statistics is not really elegant or even fun in the way that a mathematics puzzle can be. But statistics is necessary, and enormously rewarding. I like to think that we use statistical methods and principles to extract truth from data.

This week many of the high school maths teachers in New Zealand were exhorted to take part in a Stanford MOOC about teaching mathematics. I am not a high school maths teacher, but I do try to provide worthwhile materials for them, so I thought I would take a look. It is also an opportunity to look at how people with an annual budget of more than 4 figures produce on-line learning materials. So I enrolled and did the first lesson, which is about people’s attitudes to math(s) and the success or trauma that has led to those attitudes. I’m happy to say that none of this was new to me. I am rather unhappy that it would be new to anyone! Surely all maths teachers know by now that how we deal with students’ small successes and failures in mathematics will create future attitudes leading to further success or failure. If they don’t, they need to take this course. And that makes me happy – that there is such a course, on-line and free for all maths teachers. (As a side note, I loved that Jo, the teacher, switched between the American “math” and the British/Australian/NZ “maths”.)

I’ve only done the first lesson so far, and intend to do some more, but it seems to be much more about mathematics than statistics, and I am not sure how relevant it will be. And that makes me a bit sad again. (It was an emotional journey!)

Mathematics in its pure form is about thinking. It is problem solving and it can be elegant and so much fun. It is a language that transcends nationality. (Though I have always thought the Greeks get a rough deal as we steal all their letters for the scary stuff.) I was recently asked to present an enrichment lesson to a class of “gifted and talented” students. I found it very easy to think of something mathematical to do – we are going to work around our Rogo puzzle, which has some fantastic mathematical learning opportunities. But thinking up something short and engaging and realistic in the statistics realm is much harder. You can’t do real statistics quickly.

On my run this morning I thought a whole lot more about this mathematics/statistics divide. I have written about it before, but more in defense of statistics, and warning the mathematics teachers to stay away or get with the programme. Understanding commonalities and differences can help us teach better. Mathematics is pure and elegant, and borders on art. It is the purest science. There is little beautiful about statistics. Even the graphs are ugly, with their scattered data and annoying outliers messing it all up. The only way we get symmetry is by assuming away all the badly behaved bits. Probability can be a bit more elegant, but with that we are creeping into the mathematical camp.

English Language and English literature

I like to liken. I’m going to liken maths and stats to English language and English literature. I was good at English at school, and loved the spelling and grammar aspects especially. I have in my library a very large book about the English language (The Cambridge Encyclopedia of the English Language, by David Crystal) and one day I hope to read it all. It talks about sounds and letters, words, grammar, syntax, origins, meanings. Even to dip into, it is fascinating. On the other hand I have recently finished reading “The End of Your Life Book Club” by Will Schwalbe, which is a biography of his amazing mother, set around the last two years of her life as she struggles with cancer. Will and his mother are avid readers, and use her time in treatment to talk about books. This book has been an epiphany for me. I had forgotten how books can change your way of thinking, and how important fiction is. At school I struggled with the literature side of English, as I wanted to know what the author meant, and could not see how it was right to take my own meaning from a book, poem or work of literature. I have since discovered post-modernism and am happy drawing my own meaning.

So what does this all have to do with maths and statistics? Well I liken maths to English language. In order to be good at English you need to be able to read and write in a functional way. You need to know the mechanisms. You need to be able to DO, not just observe. In mathematics, you need to be able to approach a problem in a mathematical way.  Conversely, to be proficient in literature, you do not need to be able to produce literature. You need to be able to read literature with a critical mind, and appreciate the ideas, the words, the structure. You do need to be able to write enough to express your critique, but that is a different matter from writing a novel.  This, to me is like being statistically literate – you can read a statistical report, and ask the right questions. You can make sense of it, and not be at the mercy of poorly executed or mendacious research. You can even write a summary or a critique of a statistical analysis. But you do not need to be able to perform the actual analysis yourself, nor do you need to know the exact mathematical theory underlying it.

Statistical Literacy?

Maybe there is a problem with the term “statistical literacy”. The traditional meaning of literacy includes being able to read and write – to consume and to produce – to take meaning and to create meaning. I’m not convinced that what is called statistical literacy is the same.

Where I’m heading with this is that statistics is a way to win back the mathematically disenfranchised. If I were teaching statistics to a high school class I would spend some time talking about what statistics involves and how it overlaps with, but is not, mathematics. I would explain that even people who have had difficulty in the past with mathematics can do well at statistics.

The following table outlines the different emphasis of the two disciplines.

Mathematics | Statistics
Proficiency with numbers is important | Proficiency with numbers is helpful
Abstract ideas are important | Concrete applications are important
Context is to be removed so that we can model the underlying ideas | Context is crucial to all statistical analysis
You don’t need to write very much | Written expression in English is important

Another idea related to this is that of “magic formulas” or the cookbook approach. I don’t have a problem with cookbooks and knitting patterns. They help me to make things I could not otherwise. However, the more I use recipes and patterns, the more I understand the principles on which they are based. But this is a thought for another day.

The Knife-edge of Competence

I do my own video-editing using a very versatile and complex program called Adobe Premiere Pro. I have had no formal training, and get help by ringing my son, who taught me all I know and can usually rescue me with patient instructions over the phone. At times, especially in the early stages, I have felt myself wobbling along the knife-edge of competence. All I needed was for something new to go wrong, or to click a button inadvertently, and I would fall off the knife-edge and the whole project would disappear into a mass of binary. This was not without good reason. Premiere Pro wasn’t always stable on our computer, and at one point it took us several weeks to get our hard-drive replaced. (Apple “Time Machine” saved me from despair.) And sometimes I would forget to save regularly and a morning’s work was lost. (Even Time Machine can’t help with that level of incompetence.)

But despite my severe limitations I have managed to edit over twenty videos that now receive due attention (and at times adulation!) on YouTube. It isn’t an easy feeling, to be teetering on the brink of disaster, real or imagined. But there was no alternative, and there is a sense of pride at having made it through with only a few scars and not too much inappropriate language.

There are some things at which I feel totally competent. I can speak to a crowd of any number of people and feel happy that they will be entertained, edified and perhaps even educated. I can analyse data using basic statistical methods. I can teach a person about inference. Performing these tasks is a joy, because I know I have the prerequisite skills and knowledge to cope with whatever happens. But on the way to getting to this point, I had to walk the knife-edge of competence.

Many teachers of statistics know too well this knife-edge. In New Zealand at present there are a large number of teachers of Year 13 statistics who are teaching about bootstrapping, when their own understanding of it is sketchy. They are teaching how to write statistical reports, when they have never written one themselves. They are assessing statements about statistics that they are not actually sure about. This is a knife-edge. They feel that any minute a student will ask them a question about the content that they cannot answer. These are not beginning teachers, but teachers with years and decades of experience in teaching mathematics and mathematical statistics. But the innovations of the curriculum have put them in an uncomfortable position. Inconsistent, tardy and even incorrect information from the qualification agency is not helping, but that is a story for another day.

In another arena there are professors and lecturers of statistics (in the antipodes we do not throw around the title “professor” with the abandon of our North American cousins) who are extremely competent at statistical mathematics and analysis but who struggle to teach in a satisfactory way. Their knife-edge concerns teaching, appropriate explanation and the generation of effective learning activities and assessments in the absence of any educational training. They fear that someone will realise one day that they don’t really know how to devise learning objectives, and provide fair assessments. I am hoping that this blog is going some way to helping these people to ask for help! Unfortunately the frequent response is avoidance behaviour, which is alarmingly supported by a system that rewards research publications rather than effective educational endeavours.

So what do you do when you are walking the knife-edge of competence?

You do the best you can.

And sometimes you fake it.

I am led to believe there is a gender-divide on this. Some people are better at hiding their incompetence than others, and just about all the people I know like that are men. I had a classmate in my honours year who was at a similar level of competence to me, but he applied for jobs I wouldn’t have contemplated. The fear of being shown up as a fake, or not knowing EXACTLY what to do at any point stopped me from venturing. He horrified me further a few years later when he set up his own company. Nearly three decades, two children and a PhD later I am not so fastidious or “nice” in the Jane Austen meaning of the word. If I think I can probably learn how to do something in time to make a reasonable fist of it and not cause actual harm, I’m likely to have a go. Hence taking my redundancy and running!

When I first lectured in statistics for management, I did not know much beyond what I was teaching. I lived in fear that someone would ask me a question that I couldn’t answer and I would be revealed as the fake I was. Well you know, it never happened! I even taught students who were statistics majors, who did know more than I, and post-graduate students in psychology and heads of mathematics departments, and my fears were never realised. In fact the stats students told me that they finally understood the central limit theorem, thanks to my nifty little exercise using dotplots on Minitab. (Which was how I had finally understood the central limit theorem – or at least the guts of it.)
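For readers who would like to try something in the spirit of that dotplot exercise, here is a rough sketch. It is not the original Minitab exercise: the skewed population, the sample sizes, the seed and the function names are all assumptions chosen for illustration. It draws repeated samples from a lopsided population and plots the sample means as rows of dots, so you can see the means bunch up symmetrically around the population mean as the sample size grows.

```python
import random
from statistics import mean, stdev

def sample_means(population, sample_size, n_samples, rng):
    """Means of n_samples random samples (drawn with replacement) of the given size."""
    return [mean(rng.choices(population, k=sample_size)) for _ in range(n_samples)]

def text_dotplot(values, bin_width):
    """Print a crude dotplot: one row per bin, one dot per value in that bin."""
    counts = {}
    for v in values:
        b = round(v / bin_width) * bin_width
        counts[b] = counts.get(b, 0) + 1
    for b in sorted(counts):
        print(f"{b:6.1f} | " + "." * counts[b])

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Assumed population: 10,000 right-skewed "waiting times" averaging about 10
population = [rng.expovariate(1 / 10) for _ in range(10_000)]

means_n2 = sample_means(population, 2, 200, rng)    # tiny samples: still skewed
means_n30 = sample_means(population, 30, 200, rng)  # larger samples: roughly bell-shaped

print("Sample means, n = 2")
text_dotplot(means_n2, 2.0)
print("Sample means, n = 30")
text_dotplot(means_n30, 1.0)
```

Comparing the two plots shows the guts of the theorem: the population is lopsided, but the means of larger samples are much less spread out and look far more symmetric.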

I’m guessing that this is probably true for most of the mathematics teachers who are worrying. Despite their fear, they have not been challenged or called out.

The teachers’ other unease is the feeling that they are not giving the best service to their students, and the students will suffer, miss out on scholarships, decide not to get a higher education and live their lives on the street.  I may be exaggerating a little here, but certainly few of us like to give a service that is less than what we are accustomed to. We feel bad when we do something that feels substandard.

There are two things I learned in my twenty years of lecturing that may help here:

We don’t know how students perceive what we do. Every now and again I would come out of a lecture with sweat trickling down my spine because something had gone wrong. It might be that in the middle of an explanation I had had second thoughts about it, changed tack, then realised I was right in the first place and ended up confusing myself. Or perhaps part way through a worked example it was pointed out to me that there was a numerical error in line three. To me these were bad, bad things to happen. They undermined my sense of competence. But you know, the students seldom even noticed. What felt like the worst lecture of my life was in fact still just fine.

The other thing I learned is that we flatter ourselves when we think how much difference our knowledge may make.  Now don’t get me wrong here – teachers make an enormous difference. People who become teachers do so because we want to help people. We want to make a difference in students’ lives. We often have a sense of calling. There may be some teachers who do it because they don’t know what else to do with their degree, but I like to think that most of us teachers teach because to not teach is unthinkable. I despise, to the point of spitting as I talk, the expression “Those who can, do, and those who can’t, teach.” One day when the mood takes me I will write a whole post about the noble art of teaching and the fallacy of that dismissive statement. My next statement is so important I will give it a paragraph of its own.

A teacher who teaches from love, who truly cares about what happens to their students, even if they are struggling on the knife-edge of competence, will not ruin their students’ lives through temporary incompetence in an aspect of the curriculum.

There are many ways that a teacher can have devastating effects on their students, but being, for a short time, on the knife-edge of competence, is not one of them.

Take heart, keep calm and carry on!

Teaching statistical report-writing

Teaching how to write statistical reports

It is difficult to write statistical reports and it is difficult to teach how to write statistical reports.

When statistics is taught in the traditional way, with emphasis on the underlying mathematics, the process of statistics is truncated at both ends. When we concentrate on the sterile analysis, the messy “writing stuff” is avoided. Students do not devise their own investigative questions, and they do not write up the results.

Here’s the thing though – in reality, the analysis step of a statistical investigation is a very small part of the whole, and performed at the click of a button or two.

Ultimately the embedding of the analysis back into an investigation should not be a problem. The really interesting part of statistics happens all around the analysis. Understanding the context enriches the learning, transforming the discipline from mathematics to statistics. We can help students embrace the excitement of a true statistical investigation. But in this time of transition, the report-writing aspects are a problem. They are a problem for the learner and for the teacher.

The new New Zealand curriculum for statistics requires report-writing as an essential component of the majority of assessment, particularly at the final year of high school. This is causing understandable concern among teachers, who come predominantly from a mathematical background. I can imagine myself a few years ago saying, “I became a maths teacher so I wouldn’t have to teach and mark essays!” In addition the results from the students are less than stellar, even from capable students. Teachers do not like their students to perform poorly.

All statistics courses should have a component of report-writing, unless they are courses in the mathematics of statistics. The problem here is that, like the secondary school teachers in New Zealand, many statistics instructors deal with the mathematics more than the application of statistics, and are not confident of their own ability at report-writing. Normal human behaviour is to avoid it. Having taught service statistics courses in a business school for two decades, I have gradually made the transition to more emphasis on report-writing and am convinced that statistical report-writing needs to be taught explicitly, and taught well.

Report-writing is a fundamental and useful skill

For teachers who are uncomfortable with teaching and marking reports, it would be nice to dismiss the process of report-writing  as “not important”. Much of statistics teaching is in a service course, as discussed in my previous blog. It is unlikely that any of these students will ever have to write a report on a statistical analysis, other than as part of the assessment for the course.  So why do we put them and ourselves through this?

You don’t realise whether you understand or not until you try to write it down.

The written word requires a higher level of precision than a thought or a spoken explanation. Your sentences look at you from the page and mock you with their vagueness and ambiguity. I find this out time and again as I blog. What seems like a well thought out argument in my head as I do my morning run, falls to shreds on paper, before being mustered into some semblance of order. It is in writing that we identify the flaws in our understanding. As we try to write our findings we become more aware of fuzzy thinking and gaps in reasoning. As we write we are required to organise our thoughts.

Better critics of other reports

A student who has been required to produce a report of a good standard will be exposed to examples of good and bad reports and will be better able to identify incorrect thinking in reports they read themselves. This is perhaps the most important purpose of a terminal course in statistics. Having said that, it is both heart-warming and alarming to hear from past-students the wonderful things they are doing with the statistics they learned in my one-semester course.

Useful skill for employment

Students need to be able to read and write as part of empowered citizenship. The skill of writing a coherent report in good English is highly sought after by employers, and of great use at university in just about every discipline. It is a transferable skill to many endeavours.

Reports are needed for assessment

On a practical level, if the teacher is going to evaluate understanding they need evidence to work from. A written report provides one form of evidence of understanding.

Report-writing is difficult to teach

Some maths teachers may feel inadequate in teaching “English”, as they see report-writing. They do not have the pedagogical content knowledge in teaching writing that they do for teaching algebra or percentages, for instance. Pedagogical content knowledge is more than the intersection of knowing a subject, and being able to teach in a general sort of way. It is the knowledge of how to teach a certain discipline, what is difficult to learners, and how to help them learn.

Some basic ideas for teaching report-writing

To write a good report you need to understand what is going on, have the appropriate vocabulary, and use a clear structure. Good teaching will emphasise understanding. Getting students to write sentences about output, and sharing them with their peers, is a great way to identify misunderstandings. As these sentences are shared, the teacher can model the use of correct technical language. They can say, for instance, “You have the essence correct here, but there are some more precise terms you could use, such as …” Teachers can either give students outlines for reports, or they can give them several good reports and get the students to identify the underlying structure. I am a firm believer in the generous use of headings within a report. They provide signposts for writer and reader alike.

Report-writing requires practice. The assessment report should not be the first report of that type that a student writes. In the world of motivated students with no other demands on their time, it would be great to have them write up one assignment for the practice and then learn from that to produce a better one. I am aware that students tend not to do the work unless there is a grade attached to it, so it can be difficult to get a student to do a “practice report” ahead of the “real assessment.”  There are other alternatives that approximate this, however, which require less input from the teacher. One of these, the use of templates, is explained in an earlier post, Templates for statistical reports – spoon-feeding?

There is nothing wrong with using templates and “sensible sentences” (not to be confused with “sensible sentencing”, which seems devoid of sense). There are only so many ways to say that “the median number of pairs of shoes owned by women is ten.” It is also a difficult sentence to make sound elegant. Good reports will look similar. This is not creative-writing – it is report-writing. Sure, the marking may be boring when all the reports seem very similar, but it is a small price to pay when you avoid banging your head against the desk at the bizarre and disorganised offerings.
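To show how mechanical a “sensible sentence” can be, here is a small sketch. Everything in it is hypothetical: the function name, the wording of the template and the shoe counts are invented for illustration and are not taken from any real survey. It computes a median and drops it into a fixed sentence pattern, which is really all a template asks a student to do.

```python
from statistics import median

def median_sentence(values, what, who):
    """Fill a fixed sentence template with the median of the data."""
    m = median(values)
    return f"The median {what} owned by {who} is {m:g}."

# Made-up sample: pairs of shoes owned by nine women
shoes = [4, 7, 8, 10, 10, 12, 15, 20, 35]

sentence = median_sentence(shoes, "number of pairs of shoes", "women in the sample")
print(sentence)
```

The analysis is one line and the sentence is another. The student’s real work is choosing the right statistic and naming the variable and the group correctly.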

This is but a musing on the teaching of report-writing. Glenda Francis, in  “An approach to report writing in statistics courses” identifies similar issues, and provides a fuller background to the problem. She also indicates that there is much to be done in developing this area of teaching and research. I will be providing professional development in this area over the next month to at least three groups of teachers, and I look forward to learning a great deal from them, as we explore these issues together.