# Videos for teaching and learning statistics

It delights me that several of my statistics videos have been viewed over half a million times each. There is also a stream of lovely comments (with the odd weird one) from happy viewers who have found in the videos an answer to their problems.

In this post I will outline the main videos available on the Statistics Learning Centre YouTube Channel. They already belong to 24,000 playlists and lists of recommended resources in textbooks the world over. We are happy for teachers and learners to continue to link to them. Having them all in one place should make it easier for instructors to decide which ones to use in their courses.

# Philosophy of the videos

Early on in my video production I wrote a series of blog posts about the videos. One was Effective multimedia teaching videos. The videos use graphics and audio to increase understanding and retention, and are mostly aimed at conceptual understanding rather than procedural understanding.

I also wrote a critique of Khan Academy videos, explaining why I felt they should be improved. Not surprisingly this ruffled a few feathers and remains my most commented on post. I would be thrilled if Khan had lifted his game, but I fear this is not the case. The Khan Academy pie chart video still uses an unacceptable example, with too many categories that are also ordered. (January 2018)

Before setting out to make videos about confidence intervals, I critiqued the existing offerings in this post. At the time the videos were all about how to find a confidence interval, and not what it does. I suspect that may be why my video, Understanding Confidence Intervals, remains popular.

# Introducing statistics

## Understanding Summary Statistics 5:14 minutes

Why we need summary statistics and what each of them does. It is not about how to calculate the statistics, but what they mean. It uses the shoe example, which also appears in the PPDAC and OSEM videos.

## Understanding Graphs 6:06 minutes

It briefly explains the use and interpretation of seven different types of statistical graph. They include the pictogram, bar chart, pie chart, dot plot, stem and leaf, scatterplot and time series.

## Analysing and commenting on Graphical output using OSEM 7:13 minutes

This video teaches how to comment on graphs and other statistical output by using the acronym OSEM. It is especially useful for students in NCEA statistics classes in New Zealand, but many people everywhere can find OSEM awesome! We use the example of comparing the number of pairs of shoes men and women students say they own.

## Variation and Sampling error 6:30 minutes

Statistical methods are necessary because of the existence of variation. Sampling error is one source of variation, and is often misunderstood. This video explains sampling error, along with natural variation, explainable variation and variation due to bias. There is an accompanying video on non-sampling error.

## Sampling methods 4:54 minutes 500,000 views

This video describes five common methods of sampling in data collection – simple random, convenience, systematic, cluster and stratified. Each method has a helpful symbolic representation.
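For readers who like to see the five methods side by side, here is a minimal Python sketch. The population of 100 students in five classes, the sample size and all variable names are made up for illustration and do not come from the video.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100 students, each in one of 5 classes of 20.
population = [{"id": i, "klass": i // 20} for i in range(100)]
n = 10

# Simple random: every student has the same chance of selection.
simple_random = rng.sample(population, n)

# Convenience: whoever is easiest to reach (here, the first n on the roll).
convenience = population[:n]

# Systematic: a random start, then every k-th student.
k = len(population) // n
start = rng.randrange(k)
systematic = population[start::k]

# Cluster: pick whole classes at random and take everyone in them.
chosen_classes = rng.sample(range(5), 2)
cluster = [p for p in population if p["klass"] in chosen_classes]

# Stratified: a random sample of equal size from within every class.
stratified = []
for c in range(5):
    members = [p for p in population if p["klass"] == c]
    stratified.extend(rng.sample(members, n // 5))

for name, chosen in [("simple random", simple_random), ("convenience", convenience),
                     ("systematic", systematic), ("cluster", cluster),
                     ("stratified", stratified)]:
    print(name, sorted(p["id"] for p in chosen))
```

Note that the cluster sample is larger than the others, because it takes every member of the chosen classes.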

## Types of data 6:20 minutes 600,000 views

The kind of graph and analysis we can do with specific data is related to the type of data it is. In this video we explain the different levels of data, with examples. This video is particularly popular at the start of courses.

## Important Statistical concepts 5:34 minutes 50,000 views

This video does not receive the views it deserves, as it covers three really important ideas. Maybe I should split it up into three videos. The ideas are the difference between significance and usefulness, evidence and strength of effect, causation and association.

Other videos complementary to these, but not on YouTube, are:

• The statistical enquiry process
• Understanding the Box Plot
• Non-sampling error

# Videos for teaching hypothesis testing

## Understanding Statistical inference 6:46 minutes 40,000 views

The most difficult concept in statistics is that of inference. This video explains what statistical inference is and gives memorable examples. It is based on research around three concepts pivotal to inference – that the sample is likely to be a good representation of the population, that there is an element of uncertainty as to how well the sample represents the population, and that the way the sample is taken matters.

## Understanding the p-value 4:43 minutes 500,000 views

This video explains how to use the p-value to draw conclusions from statistical output. It includes the story of Helen, making sure that the choconutties she sells have sufficient peanuts. It introduces the helpful phrase “p is low, null must go”.

## Inference and evidence 3:34 minutes

This is a newer video, based on a little example I used in lectures to help students see the link between evidence and inference. Of course it involves chocolate.

## Hypothesis tests 7:38 minutes 350,000 views

This entertaining video works step-by-step through a hypothesis test. Helen wishes to know whether giving away free stickers will increase her chocolate sales. This video develops the ideas from “Understanding the p-value”, giving more of the process of hypothesis testing. It is also complemented by the following video, which shows how to perform the analysis using Excel.

## Two-means t-test in Excel 3:54 minutes 50,000 views

A step-by-step lesson on how to perform an independent samples t-test for difference of two means using the Data Analysis ToolPak in Excel. This is a companion video to Hypothesis tests, p-value, two means t-test.

## Choosing which statistical test to use 9:33 minutes 500,000 views

I am particularly proud of this video, and the way it links the different tests together. It took a lot of work to come up with this. First it outlines a process for thinking about the data, the sample and the thing you are trying to find out. Then it works through seven tests with scenarios based around Helen and the Choconutties. This video is particularly popular near the end of the semester, for tying together the different tests and applications.

# Confidence Intervals

## Understanding Confidence Intervals 4:02 minutes 500,000 views

This short video gives an explanation of the concept of confidence intervals, with helpful diagrams and examples. The emphasis is on what a confidence interval is and how it is used, rather than how it is calculated or derived.

## Calculating the confidence interval for a mean using a formula 5:29 minutes 200,000 views

This video carries on from “Understanding Confidence Intervals” and introduces a formula for calculating a confidence interval for a mean. It uses graphics and animation to help understanding.
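The video's own formula is not reproduced in this post. As a rough sketch, a common textbook form is sample mean ± multiplier × (sample standard deviation / √n). The data below are invented, and the multiplier 2.262 is the usual table value for a 95% interval with 9 degrees of freedom.

```python
import math
import statistics

# Hypothetical sample of 10 measurements (made up for illustration).
sample = [50.1, 49.8, 50.4, 50.0, 49.6, 50.3, 50.2, 49.9, 50.5, 49.7]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)          # sample standard deviation (n - 1 divisor)
standard_error = sd / math.sqrt(n)

t_multiplier = 2.262                   # t table value: 95% confidence, n - 1 = 9 df
margin = t_multiplier * standard_error

lower, upper = mean - margin, mean + margin
print(f"95% confidence interval for the mean: {lower:.2f} to {upper:.2f}")
```

A different sample size or confidence level needs a different multiplier, which is why software normally looks it up for us.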

There are also videos pertinent to the New Zealand curriculum using bootstrapping and informal methods to find confidence intervals.

# Probability

## Introduction to Probability 2:54 minutes

This video explains what probability is and why we use it. It does NOT use dice, coins or balls in urns. It is the first in a series of six videos introducing basic probability with a conceptual approach. The other five videos can be accessed through subscription.

## Understanding Random Variables 5:08 minutes 90,000 views

The idea of a random variable can be surprisingly difficult. In this video we help you learn what a random variable is, and the difference between discrete and continuous random variables. It uses the example of Luke and his ice cream stand.

## Understanding the Normal Distribution 7:44 minutes

In this video we explain the characteristics of the normal distribution, and why it is so useful as a model for real-life entities.

There are also two other videos about random variables, discrete and continuous.

## Risk and Screening 7:54 minutes

This video explains risk and screening, and shows how to calculate and express rates of false positives and false negatives. An imaginary disease, “Earpox”, is used for the examples.
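To give a feel for the kind of calculation involved, here is a small Python sketch. Every number in it is invented for illustration and none of them comes from the video.

```python
# All counts are made up for illustration; they are not from the video.
population = 1000
with_disease = 20                      # 2% of people have the imaginary disease
without_disease = population - with_disease

true_positives = 18                    # the test flags 18 of the 20 who have it
false_negatives = with_disease - true_positives
false_positives = 49                   # the test wrongly flags 49 of the 980 who do not
true_negatives = without_disease - false_positives

false_negative_rate = false_negatives / with_disease
false_positive_rate = false_positives / without_disease
chance_positive_is_real = true_positives / (true_positives + false_positives)

print(f"False negative rate: {false_negative_rate:.0%}")
print(f"False positive rate: {false_positive_rate:.0%}")
print(f"Chance that a positive result is a real case: {chance_positive_is_real:.0%}")
```

With a rare disease, most positive results can be false positives even when the test looks accurate, which is one reason it helps to express the results as counts of people.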

# Other videos

## Designing a Questionnaire 5:23 minutes 40,000 views

This was written specifically to support learning in Level 1 NCEA in the NZ school system but is relevant for anyone needing to design a questionnaire. There is a companion video on good and bad questions.

# Line-fitting and regression

## Scatterplots in Excel 5:17 minutes

The first step in doing a regression in Excel is to fit the line using a scatter plot. This video shows how to do this, illustrated by the story of Helen and the effect of temperature on her sales of choconutties.

## Regression in Excel 6:27 minutes

This video explains Regression and how to perform regression in Excel and interpret the output. The story of Helen and her choconutties continues. This follows on from Scatterplots in Excel and Understanding the p-value.

There are three videos introducing bivariate relationships in a more conceptual way.

There are also videos covering experimental design and randomisation, time series analysis and networks. In the pipeline is a video “understanding the Central Limit Theorem.”

# Supporting our endeavours

As explained in a previous post, Lessons for a budding Social Enterprise, Statistics Learning Centre is a social enterprise, with our aim to build a world of mathematicians and enable people to make intelligent use of statistics. Though we get some income from YouTube videos, it does not support the development of more videos. If you would like to help us to create further videos contact us to discuss subscriptions, sponsorship, donations and advertising possibilities. info@statsLC.com or n.petty@statsLC.com.

# Graphs – beauty and truth (with apologies to Keats)

## A good graph is elegant

I really like graphs. I like the way graphs turn numbers into pictures. A good graph is elegant. It uses a few well-placed lines to communicate what would take a paragraph of text. And like a good piece of literature or art, a good graph continues to give, beyond the first reading. I love looking at my YouTube and WordPress graphs. These graphs tell me stories. The WordPress analytics tell me that when I put up a new post, I get more hits, but that every day more than 1000 people read one of my posts. The YouTube analytics tell me stories about when people want to know about different aspects of statistics. It is currently the end of the North American school year, and the demand is for my video on Choosing which statistical test to use. Earlier in the year, the video about levels of measurement is the most popular. And not many people view videos about statistics on the 25th of December. I’m happy to report that the YouTube and WordPress graphs are good graphs.

Spreadsheets have made it possible for anyone and everyone to create graphs. I like that graphs are easier to make. Drawing graphs by hand is a laborious task and fraught with error. But sometimes my heart aches when I see a graph used badly. I suspect that this is when a graphic artist has taken control, and the search for beauty has over-ridden the need for truth.

Three graphs spurred me to write this post.

## Graph One: Bad-tasting Donut on house occupation

The first was on a website to find out about property values. I must have clicked onto something to find out about the property values in my area, and was taken to the qv website. And this is the graph that disturbed me.

Graphs named after food are seldom a good idea

Sure it is pretty – uses pretty colours and shading, and you can find out what it is saying by looking at the key – with the numbers beside it. But a pie or donut chart should not be used for data which has inherent order. The result here is that the segments are not in order. Or rather they are ordered from most frequent to least frequent, which is not intuitive. Ordinal data is best represented in a bar or column chart. To be honest, most data is best represented in a bar or column chart. My significant other suggested that bar charts aren’t as attractive as pie charts. Circles are prettier than rectangles. Circles are curvy and seem friendlier than straight lines and rectangles. So prettiness has triumphed over truth.

## Graph Two: Misleading pictogram (a tautology?)

It may be a little strong to call bad communication lack of truth. Let’s look at another example. In a way it is cheating to cite a pictogram in a post like this. Pictograms are the lowest form of graph and are so often incorrect, that finding a bad one is easier than finding a good one. In the graph below of fatalities it is difficult to work out what one little person represents.

What does one little person represent?

A quick glance, ignoring the numbers, suggests that the road toll in 2014 is just over half what it was in 2012. However, the truth, calculated from the numbers, is that the relative size is 80%. 2012 has 12 people icons, representing 280 fatalities. One icon is removed for 2013, representing a drop of 9 fatalities. 2011 has one icon fewer again, representing a drop of 2 fatalities. There is so much wrong in the reporting of road fatalities, that I will stop here. Perhaps another day…
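The icon arithmetic can be checked with a few lines of Python. The sketch uses only the figures quoted above (12 icons for 280 fatalities, then drops of 9 and 2); treating the two drops as successive is an assumption.

```python
icons_first_year = 12
fatalities_first_year = 280
per_icon = fatalities_first_year / icons_first_year
print(f"One icon stands for about {per_icon:.1f} fatalities")

drops = [9, 2]           # drops in fatalities, as quoted (assumed successive)
icons_shown = [11, 10]   # one icon removed each time

fatalities = fatalities_first_year
for drop, shown in zip(drops, icons_shown):
    fatalities -= drop
    proportional = fatalities / per_icon
    print(f"{fatalities} fatalities: {shown} icons shown, "
          f"{proportional:.1f} icons would be proportional")
```

On these figures a whole icon disappears for a change worth well under half an icon, so the picture exaggerates the drop.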

## Graph Three: Mysterious display on Household income

And here is the other graph that perplexed me for some time. It came in the Saturday morning magazine from our newspaper, as part of an article about inequality in New Zealand. Anyone who reads my blog will be aware that my politics place me well left of centre, and I find inequality one of the great ills of the modern day. So I was keen to see what this graph would tell me. And the answer is…

See how long it takes for you to find where you appear on the graph. (Pretending you live in NZ)

I have no idea. Now, I have expertise in the promulgation of statistics, and this graph stumped me for some time. Take a good look now, before I carry on.

Graphs are the main way that statistical analysts communicate with the outside world. Graphs like these ones do us no favours, even if they are not our fault. We need to do better, and make sure that all students learn about graphs.

## Teaching suggestion – a graph a day

Here is a suggestion for teachers at all levels. Have a “graph a day” display – maybe for a month? Students can contribute graphs from the news media. Each day discuss what the graph is saying, and critique the way the graph is communicating. I have a helpful structure for reading graphs in my post: There’s more to reading graphs than meets the eye.

Here is a summary of what I’ve said and what else I could say on the topic.

• The choice of graph depends on the purpose
• The text should state the purpose of the graph
• There is not a graph for everything you wish to communicate
• Sometimes a table communicates better than a graph
• Graphs are part of the analysis as well as part of the reporting. But some graphs are better to stay hidden.
• If it takes more than a few seconds to work out what a graph is communicating it should either be dumped or have an explanation in the text
• Truth (or communication) is more important than beauty
• There is beauty in simplicity
• Be aware that many people are colour-blind, or cannot easily differentiate between different shades.

## Feedback from previous post on which graph to use

Late last year I posted four graphs of the same data and asked for people’s opinions. You can link back to the post here and see the responses: Which Graph to Use.

The interesting thing is not which graph was selected as the most popular, but rather that each graph had a considerable number of votes. My response is that it depends.  It depends on the question you are answering or the message you are sending. But yes – I agree with the crowd that Graph A is the one that best communicates the various pieces of information. I think it would be improved by ordering the categories differently. It is not very pretty, but it communicates.

I recently posted a new video on YouTube about graphs. It is a quick once-over of important types of graphs, and can help to clarify what they are about. There are examples of good graphs in there.

I have written about graphs previously and you can find them here on the Collected Works page.

I’m interested in your thoughts. And I’d love to see some beautiful and truthful graphs in the comments.

# The role of play in learning

I have been reading further about teaching mathematics and came across this interesting assertion:

Play, understood as something frivolous, opposed to work, off-task behaviour, is not welcomed into most mathematics classrooms. But play is exactly what is needed. It is only play that can entice us to the type of repetition that is needed to learn how to inhabit the mathematical landscape and how to create new mathematics.
Friesen (2000) – unpublished thesis, cited in Stordy, Children Count, (2015)

# Play and practice

It is an appealing idea that as children play, they have opportunities to engage in repetition that is needed in mastering some mathematical skills. The other morning I decided to do some exploration of prime numbers and factorising even before I got out of bed. (Don’t judge me!) It was fun, and I discovered some interesting properties, and came up with a way of labelling numbers as having two, three and more dimensions. 12 is a three dimensional number, as is 20, whereas 35 and 77 are good examples of two dimensional numbers. As I was thus playing on my own, I was aware that I was practising my tables and honing my ability to think multiplicatively. In this instance the statement from Friesen made sense. I admit I’m not sure what it means to “create new mathematics”. Perhaps that is what I was doing with my 2 and 3 dimensional numbers.
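The post does not define these “dimensions”. One reading that fits all four examples is the number of prime factors counted with repeats (12 = 2 × 2 × 3, so three). The Python sketch below works under that assumption, and the function name is made up for illustration.

```python
def prime_factors(n):
    """Return the prime factors of n (n >= 2), with repeats, in ascending order."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        factors.append(n)  # whatever is left over is itself prime
    return factors

for number in (12, 20, 35, 77):
    factors = prime_factors(number)
    print(number, "=", " x ".join(map(str, factors)),
          "->", len(factors), "dimensions")
```

On this reading a prime is a one dimensional number, and 30 = 2 × 3 × 5 joins 12 and 20 in three dimensions.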

You may be wondering what this has to do with teaching statistics to adults. Bear with…

## Traditional vs recent teaching methods for mathematics

Today on Twitter, someone asked what to do when a student says that they like being shown what to do, and then practising on textbook examples. This is the traditional method for teaching mathematics, and is currently not seen as ideal among many maths teachers (particularly those who inhabit the MathTwitterBlogosphere or MTBoS, as it is called). There is strong support for a more investigative, socially constructed approach to learning and teaching mathematics.  I realise that as a learner, I was happy enough learning maths by being shown what to do and then practising. I suspect a large proportion of maths teachers also liked doing that. Khan Academy videos are wildly popular with many learners and far too many teachers because they perpetuate this procedural view of mathematics. So is the procedural approach wrong? I think what it comes down to is what we are trying to teach. Were I to teach mathematics again I would not use “show then practise” as my modus operandi. I would like to teach children to become mathematicians rather than mathematical technicians. For this reason, the philosophies and methods of Youcubed, Dan Meyer and other MTBoS bloggers have appeal.

## Play and statistics

Now I want to turn my thoughts to statistics. Is there a need for more play in statistics? Can statistics be playful in the way that mathematics can be playful? Operations Research is just one game after another! Simulation, critical path, network analysis, travelling salesperson, knapsack problem: they are all big games. Probability is immensely playful, but what about statistical analysis? Can and should statistics be playful?

My first response is that there is no play in statistics. Statistics is serious and important, and deals with reality, not joyous abstract ideas like prime numbers and the Fibonacci series – and two and three dimensional numbers.

## The excitement of a fresh set of data

But there is that frisson of excitement as you finally finish cleaning your database and a freshly minted set of variables and observations beckons to you, with SPSS, SAS or even Excel at your fingertips. A new set of data is a new journey of discovery. Of course a serious researcher has already worked out a methodical route through her hypotheses… maybe. Or do we mostly all fossick about looking for patterns and insights, growing more and more familiar with the feel of the data, as if we were squeezing it through our fingers? So yes – my experience of data exploration is playful. It is an adventure, with wrong turns, forgetting the path, starting again, finding something only to lose it again and finally saying “enough” and taking a break, not because the data has been exhausted, but because I am.

## Writing the report is like cleaning up

Writing up statistical analysis is less exciting. It feels like picking up the gardening tools and putting them away after weeding the garden. Or cleaning the paintbrushes after creating a masterpiece. That was not one of my strengths – finishing and tidying up afterwards. The problem was that I felt I had finished when the original task had been completed – when the weeds had been pulled or the painting completed. In my view, cleaning and putting away the tools was an afterthought that dragged on after the completion of the task, and too often got ignored. Happily I have managed to change my behaviour by rethinking the nature of the weeding task. The weeding task is complete when the weeds are pulled and in the compost and the implements are resting clean and safe where they belong. Similarly a statistical analysis is not what comes before the report-writing, but is rather the whole process, ending when the report is complete, and the data is carefully stored away for another day. I wonder if that is the message we give our students – a thought for another post.

# Can statistics be playful?

For I have not yet answered the question. Can statistics be playful in the way that mathematics can be playful? We want to embed play in order to make the task of repetition more enjoyable, and learning statistics requires repetition, in order to develop skills and learn to differentiate the universal from the individual. One problem is that statistics can seem so serious. When we use databases about global warming, species extinction, cancer screening, crime detection, income discrepancies and similarly adult topics, it can seem almost blasphemous to be too playful about it.

I suspect that one reason our statistics videos on YouTube are so popular is because they are playful.

Helen has an attitude problem

Helen has a real attitude problem and hurls snarky comments at her brother, Luke. The apples fall in an odd way, and Dr Nic pops up in strange places. This playfulness keeps the audience engaged in a way that serious, grown up themes may not. This is why we invented Ear Pox in our video about Risk and screening, because being playful about cancer is inappropriate.

Ear Pox is an imaginary disease for which we are studying the screening risk.

A set of 240 Dragonistics data cards provides light-hearted data which yields satisfying results.

When I began this post I did not intend to bring it around to the videos and the Dragonistics data cards, but I have ended up there anyway. Maybe that is the appeal of the Dragonistics data cards –  that they avoid the gravitas of true and real grown-up data, and maintain a playfulness that is more engaging than reality. There is a truthiness about them – the two species – green and red dragons are different enough to present as different animal species, and the rules of danger and breath-type make sense. But students may happily play with the dragon cards without fear of ignorance or even irreverence of a real-life context.

What started me thinking about play with regards to learning maths and statistics is our Cat Maths cards. There are just so many ways to play with them that I can see Cat Maths cards playing an integral part in a junior primary classroom. This is why we created them and want them to make their way into classrooms. Sadly, our Kickstarter campaign was unsuccessful, but we hope to work with an established game manufacturer to bring them to the market by the end of 2017.

And maybe we need to be thinking a little more about the role of play in learning statistics – even for adults! What do you think? Can and should statistics be playful? And for what age group? Do you find statistical analysis fun?

# What does it mean to understand statistics?

It is possible to get a passing grade in a statistics paper by putting numbers into formulas and words into memorised phrases. In fact I suspect that this is a popular way for students to make their way through a required and often unwanted subject.

Most teachers of statistics would say that they would like students to understand what they are doing. This was a common sentiment expressed by participants in the excellent MOOC, Teaching statistics through data investigations (which is currently running again in January to May 2016.)

# Understanding

This makes me wonder what it means for students to understand statistics. There are many levels to understanding things. The concept of understanding has many nuances. If a person understands English, it means that they can use English with proficiency. If they are native speakers they may have little understanding of how grammar works, but they can still speak with correct grammar. We talk about understanding how a car works. I have no idea how a car works, apart from some idea that it requires petrol and the pistons go really, really fast. I can name parts of a car engine, such as distributor and drive shaft. But that doesn’t stop me from driving a car.

# Understanding statistics

I propose that when we talk about teaching students to understand statistics, we want our students to know why they are doing something, and have an idea of how it works. Students also need to be fluent in the language of statistics. I would not expect any student of an introductory or high school statistics class to be able to explain how least squares regression works in terms of matrix algebra, but I would expect them to have an idea that the fitted line in a bivariate plot is a model that minimises the squared error terms. I’m not sure anyone needs to know why “degrees of freedom” are called that – or even really what degrees of freedom do. These days computer packages look after degrees of freedom for us. We DO need to understand what a p-value is, and what it is telling us. For many people it is not necessary to know how a p-value is calculated.
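The idea that the fitted line minimises the squared errors can be checked without any matrix algebra. In this Python sketch the five data points are invented, and the slope and intercept come from the standard textbook formulas for simple linear regression.

```python
# Hypothetical bivariate data (made up for illustration).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

def sum_squared_errors(intercept, slope):
    """Total of the squared vertical distances from the points to the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# Standard least squares formulas for one explanatory variable.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

best = sum_squared_errors(intercept, slope)
print(f"Fitted line: y = {intercept:.2f} + {slope:.2f}x, squared error {best:.3f}")

# Nudging the line in any direction makes the squared error larger.
for nudge in (-0.2, 0.2):
    print(f"slope {slope + nudge:.2f}:",
          f"{sum_squared_errors(intercept, slope + nudge):.3f}")
    print(f"intercept {intercept + nudge:.2f}:",
          f"{sum_squared_errors(intercept + nudge, slope):.3f}")
```

Students can try their own slopes and intercepts to see that none of them beats the fitted line.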

# Ways to teach statistics

There are several approaches to teaching statistics. The approach needs to be tailored to the students and the context of the course. I prefer a hands-on, conceptual approach rather than a mathematical one. In current literature and practice there is a push for learning through investigations, often based around the statistical inquiry cycle. The problem with one long project is that students don’t get opportunities to apply principles in different situations, in such a way that will help in transfer of learning to other situations. There are some people who still teach statistics through the mathematical formulas, but I fear they are missing out on the opportunity to help students really enjoy statistics.

I do not claim to have all the answers, but we did discover one way to help students learn, alongside other methods. This approach is to use a short video, followed by a ten question true/false quiz. The quiz serves to reinforce and elaborate on concepts taught in the video, challenge students’ misconceptions, and help students be more familiar with the vocabulary and terminology of statistics. The quizzes we develop have multiple questions that randomise to give students the opportunity to try multiple times, which seems to help understanding.

This short and entertaining video gives an illustration of how you can use videos and quizzes to help students learn difficult concepts.

And here is a link to a listing of all our videos and how you can get access to them. Statistics Learning Centre Videos

# Understanding Statistical Inference

Inference is THE big idea of statistics. This is where people come unstuck. Most people can accept the use of summary descriptive statistics and graphs. They can understand why data is needed. They can see that the way a sample is taken may affect how things turn out. They often understand the need for control groups. Most statistical concepts or ideas are readily explainable. But inference is a tricky, tricky idea. Well actually – it doesn’t need to be tricky, but the way it is generally taught makes it tricky.

## Procedural competence with zero understanding

I cast my mind back to my first encounter with confidence intervals and hypothesis tests. I learned how to calculate them (by hand  – yes I am that old) but had not a clue what their point was. Not a single clue. I got an A in that course. This is a common occurrence. It is possible to remain blissfully unaware of what inference is all about, while answering procedural questions in exams correctly.

But, thanks to the research and thinking of a lot of really smart and dedicated statistics teachers, we are able to put a stop to that. And we must.

We need to explicitly teach what statistical inference is. Students do not learn to understand inference by doing calculations. We need to revisit the ideas behind inference frequently. The process of hypothesis testing is counter-intuitive and so confusing that it spills its confusion over into the concept of inference. Confidence intervals are less confusing, so are a better intermediate point for understanding statistical inference. But we need to start with the concept of inference.

# What is statistical inference?

The idea of inference is actually not that tricky if you unbundle the concept from the application or process.

The concept of statistical inference is this –

We want to know stuff about a large group of people or things (a population). We can’t ask or test them all so we take a sample. We use what we find out from the sample to draw conclusions about the population.

That is it. Now was that so hard?
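The same idea can be played out in a short simulation. In this Python sketch the population of 10,000 apple weights, its mean and spread, and the sample size of 100 are all made up for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: the weights in grams of 10,000 apples.
population = [rng.gauss(150, 20) for _ in range(10_000)]
population_mean = statistics.mean(population)

# We cannot weigh them all, so we weigh a random sample of 100.
sample = rng.sample(population, 100)
sample_mean = statistics.mean(sample)

print(f"Population mean (usually unknown): {population_mean:.1f} g")
print(f"Estimate from one sample of 100:   {sample_mean:.1f} g")

# Different samples give slightly different answers: that is the uncertainty.
other_estimates = [statistics.mean(rng.sample(population, 100)) for _ in range(5)]
print("Five more samples:", [round(m, 1) for m in other_estimates])
```

In real life we only ever see one of those samples, which is why conclusions about the population come with a statement of uncertainty.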

# Developing understanding of statistical inference in children

I have found the paper by Makar and Rubin, presenting a “framework for thinking about informal statistical inference”, particularly helpful. In this paper they summarise studies done with children learning about inference. They suggest that “three key principles … appeared to be essential to informal statistical inference: (1) generalization, including predictions, parameter estimates, and conclusions, that extend beyond describing the given data; (2) the use of data as evidence for those generalizations; and (3) employment of probabilistic language in describing the generalization, including informal reference to levels of certainty about the conclusions drawn.” This can be summed up as Generalisation, Data as evidence, and Probabilistic Language.

We can lead into informal inference early on in the school curriculum. The Key Ideas in the NZ curriculum suggest that “teachers should be encouraging students to read beyond the data. Eg ‘If a new student joined our class, how many children do you think would be in their family?’” In other words, though we don’t specifically use the terms population and sample, we can conversationally draw attention to what we learn from this set of data, and how that might relate to other sets of data.

When teaching adults we may use a more direct approach, explaining explicitly, alongside experiential learning, to build an understanding of inference. We have just completed a video: Understanding Inference. Within the video we have presented three basic ideas condensed from the Five Big Ideas in the very helpful book published by NCTM, “Developing Essential Understanding of Statistics, Grades 9-12” by Peck, Gould, Miller and Zbiek.

## Ideas underlying inference

• A sample is likely to be a good representation of the population.
• There is an element of uncertainty as to how well the sample represents the population.
• The way the sample is taken matters.

These ideas help to provide a rationale for thinking about inference, and allow students to justify what has often been assumed or taught mathematically. In addition several memorable examples involving apples, chocolate bars and opinion polls are provided. This is available for free use on YouTube. If you wish to have access to more of our videos than are available there, do email me at n.petty@statslc.com.
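The three ideas underlying inference listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of family sizes is invented, and the function names are hypothetical. It draws many random samples to show that each sample mean lands near the population mean (idea one), that the sample means vary from sample to sample (idea two), and that a badly taken sample can be well off (idea three).

```python
import random
import statistics


def make_population(size=10_000, seed=1):
    """Build a made-up population: number of children in each family."""
    rng = random.Random(seed)
    return [rng.choice([1, 1, 2, 2, 2, 3, 3, 4, 5]) for _ in range(size)]


def random_sample_mean(population, n, rng):
    """Mean of a simple random sample of size n."""
    return statistics.mean(rng.sample(population, n))


def biased_sample_mean(population, n):
    """Mean of a badly taken sample: only the n largest families."""
    return statistics.mean(sorted(population)[-n:])


population = make_population()
true_mean = statistics.mean(population)

rng = random.Random(2)
sample_means = [random_sample_mean(population, 50, rng) for _ in range(200)]

print(f"Population mean:            {true_mean:.2f}")
print(f"Smallest random-sample mean: {min(sample_means):.2f}")
print(f"Largest random-sample mean:  {max(sample_means):.2f}")
print(f"Biased-sample mean:          {biased_sample_mean(population, 50):.2f}")
```

The random-sample means cluster around the population mean without ever matching it exactly, while the biased sample misses by a wide margin no matter how many families it includes.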

# Introducing Probability

I have a guilty secret. I really love probability problems. I am so happy to be making videos about probability just now, and conditional probability and distributions and all that fun stuff. I am a little disappointed that we won’t be doing decision trees with Bayesian review, calculating EVPI. That is such fun, but I gave up teaching that some years ago.

The reason probability is fun is because it is really mathematics, and puzzles and logic. I love permutations and combinations too – there is something cool about working out how many ways something can happen.

So why should I feel guilty? Well, in all honesty I have to admit that there is very little need for most of that in a course about statistics at high-school or entry level university. When I taught statistical methods for management, we did some probability, but only from an applied viewpoint, and we never touched intersection and union signs or anything like that. We applied some distributions, but without much theoretical underpinning.

The GAISE (Guidelines for Assessment and Instruction in Statistics Education) Report says, “Teachers and students must understand that statistics and probability are not the same. Statistics uses probability, much as physics uses calculus.”

The question is, why do we teach probability – apart from the fact that it’s fun and makes a nice change from writing reports on time series and bivariate analysis, inference and experiments? The GAISE report also says, “Probability is an important part of any mathematical education. It is a part of mathematics that enriches the subject as a whole by its interactions with other uses of mathematics. Probability is an essential tool in applied mathematics and mathematical modeling. It is also an essential tool in statistics.”

The concept of probability is as important as it is misunderstood. It is vital to have an understanding of the nature of chance and variation in life, in order to be a well-informed (or “efficient”) citizen. One area in which this is extremely important is in understanding risk and relative risk. When a person is told that their chances of dying of some rare disease have just doubled, it is important that they know that it may be because they have gone from one chance in a million to two chances in a million. Sure it has doubled, but it still is pretty trivial. An understanding of probability is also important in terms of gambling and resistance to the allures of games of chance. And more socially acceptable gambling, such as stockmarket trading, also requires an understanding of chance and variation.
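The arithmetic behind the “doubled risk” example can be written out explicitly. The sketch below uses the one-in-a-million and two-in-a-million figures from the paragraph above; the variable names are just for illustration. It separates the relative risk (the ratio of the two risks) from the absolute increase (their difference).

```python
from fractions import Fraction

# Risks taken from the example in the text.
baseline_risk = Fraction(1, 1_000_000)
new_risk = Fraction(2, 1_000_000)

# Relative risk: how many times larger the new risk is.
relative_risk = new_risk / baseline_risk

# Absolute increase: how much extra risk has actually been added.
absolute_increase = new_risk - baseline_risk

print(f"Relative risk: {relative_risk}")
print(f"Absolute increase: {absolute_increase} "
      f"(about {float(absolute_increase) * 1_000_000:.0f} extra case per million people)")
```

A headline reporting only the relative risk says “doubled”; the absolute increase shows that this amounts to one additional case per million people.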

The concept of probability is important, and a few rules of probability may help with understanding, but I suspect the mathematicians get carried away and create problems that are unlikely (probability close to zero) to ever occur in reality. Anything requiring a three-way Venn Diagram has moved from applied problem to logic puzzle. This is in stark contrast to the very applied data-driven approach used in teaching statistics in New Zealand.

## Teaching Probability

The traditional approach to teaching probability is to start with the coin and the dice and the balls in the urns. As well as being mind-bogglingly boring and pointless, this also projects an artificial certainty about the probabilities, which is confusing when we start discussing models. If you look at the Khan Academy videos (but don’t) you will find trivial examples about coloured balls or sweets or strangely complex problems involving hitting a circular target. The traditional approach is also to teach probability as truth. “The probability of getting a boy is one-half”. What does that even mean?

I am currently reading the new Springer volume, Probabilistic Thinking, and intend to write a review and post it on this blog, if I can get through enough before my review copy expires. It is inspiring and surprisingly gripping (but I don’t think that is enough of a review to earn me a hard copy to keep). There are many great ideas for teaching in it, that I hope to pass on in due time.

The New Zealand approach to teaching probability comes from a modelling perspective, right from the start. At level 1, the first two years of schooling, children are exploring chance situations, playing games with a chance element and describing possible outcomes. By years 5 and 6 they are assigning numeric values to the likelihood of an occurrence. They (in the curriculum) are being introduced to model estimates and experimental estimates of probability. Bearing in mind how difficult high school maths teachers are finding the new approach, I don’t have a lot of confidence that the primary teachers are equipped yet to make the philosophical changes, let alone enact them in the classroom.
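The difference between a model estimate and an experimental estimate of probability can be shown with a small simulation. Everything here is an assumption made for illustration: a spinner with four sections that the model treats as equal, and a hypothetical “true” chance that is slightly different because the real spinner is a little off balance. The point is that the experimental estimate settles near what the spinner actually does, which need not be what the model says.

```python
import random

MODEL_ESTIMATE = 0.25   # model: four equal sections, so one chance in four
TRUE_CHANCE = 0.28      # hypothetical: the real spinner is slightly off balance


def experimental_estimate(n_spins, rng, chance=TRUE_CHANCE):
    """Proportion of spins landing on the chosen section in n_spins trials."""
    hits = sum(rng.random() < chance for _ in range(n_spins))
    return hits / n_spins


rng = random.Random(0)
print(f"Model estimate: {MODEL_ESTIMATE}")
for n_spins in (20, 200, 20_000):
    estimate = experimental_estimate(n_spins, rng)
    print(f"Experimental estimate from {n_spins} spins: {estimate:.3f}")
```

With only a few spins the experimental estimate jumps around; with many spins it becomes stable, and any remaining gap between it and the model estimate is a prompt to question the model.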

# Mathematicians teaching English

“I became a maths teacher so I wouldn’t have to mark essays”
“I’m having trouble getting the students to write down their own ideas”
“When I give them templates I feel as if it’s spoon-feeding them”

These are comments I hear as I visit mathematics teachers who are teaching the new statistics curriculum in New Zealand. They have a point. It is difficult for a mathematics teacher to teach in a different style. But – it can also be rewarding and interesting, and you never get asked, “Where is this useful?”

The statistical enquiry cycle provides a structure for all statistical investigations and learning.

We start with a problem or question, and undergo an investigation, either using extant data, an experiment or observational study to answer the question. Writing skills are key in several stages of the cycle. We need to be able to write an investigative question (or hypotheses). We need to write down a plan, and sometimes an entire questionnaire. We need to write down what we find in the analysis and we need to write a conclusion to answer the original question. That’s a whole heap of writing!

And for teachers who may not be all that happy about writing themselves, and students who chose mathematical subjects to avoid writing, it can be a bridge too far.

In previous posts on teaching report writing I promote the use of templates, and give some teaching suggestions.

In this post I am concentrating on analysing graphs, using a handy acronym, OSEM. OSEM was developed by Jeremy Brocklehurst from Lincoln High School near Christchurch NZ. There are other acronyms that would work just as well, but we like this one, not the least for its link with kiwi culture. We think it is awesome (OSEM). You could Google “o for awesome”, to get the background. OSEM stands for Obvious, Specific, Evidence and Meaning. It is a process to follow, rather than a checklist.

I like the use of O for obvious. I think students can be scared to say what they think might be too obvious, and look for tricky things. By including “obvious” in the process, it allows them to write about the important, and usually obvious features of a graph. I also like the emphasis on meaning. Unless the analysis of the data links back to the context and purpose of the investigation, it is merely a mathematical exercise.

Is this spoon-feeding? Far from it. We are giving students a structure that will help them to analyse any graph, including time series, scatter plots, and histograms, as well as boxplots and dotplots. It emphasises the use of quantitative information, linked with context. There is nothing revolutionary about it, but I think many statistics teachers may find it helpful as a way to break down and demystify the commenting process.

# Class use of OSEM

In a class setting, OSEM is a helpful framework for students to work in groups. Students individually (perhaps on personal whiteboards) write down something obvious about the graph. Then they share answers in pairs, and decide which one to carry on with. In the pair they specify and give evidence for their “obvious” statement. Then the pairs form groups of four, and they come up with statements of meaning, that are then shared with the class as a whole.

# Spoon feeding has its place

On a side-note – spoon-feeding is a really good way to make sure children get necessary nutrition until they learn to feed themselves. It is preferable to letting them starve before they get the chance to develop sufficient skills and co-ordination to get the food to their mouths independently.

# Those who can, teach statistics

The phrase I despise more than any in popular use (and believe me there are many contenders) is “Those who can, do, and those who can’t, teach.” I like many of the sayings of George Bernard Shaw, but this one is dismissive, and ignorant and born of jealousy. To me, the ability to teach something is a step higher than being able to do it. The PhD, the highest qualification in academia, is a doctorate. The word “doctor” comes from the Latin word for teacher.

Teaching is a noble profession, on which all other noble professions rest. Teachers are generally motivated by altruism, and often go well beyond the requirements of their job-description to help students. Teachers are derided for their lack of importance, and the easiness of their job. Yet at the same time teachers are expected to undo the ills of society. Everyone “knows” what teachers should do better. Teachers are judged on their output, as if they were the only factor in the mix. Yet how many people really believe their success or failure is due only to the efforts of their teacher?

For some people, teaching comes naturally. But even then, there is the need for pedagogical content knowledge. Teaching is not a generic skill that transfers seamlessly between disciplines. You must be a thinker to be a good teacher. It is not enough to perpetuate the methods you were taught with. Reflection is a necessary part of developing as a teacher. I wrote in an earlier post, “You’re teaching it wrong”, about the process of reflection. Teachers need to know their material, and keep up-to-date with ways of teaching it. They need to be aware of ways that students will have difficulties. Teachers, by sharing ideas and research, can be part of a communal endeavour to increase both content knowledge and pedagogical content knowledge.

There is a difference between being an explainer and being a teacher. Sal Khan, maker of the Khan Academy videos, is a very good explainer. Consequently many students who view the videos are happy that elements of maths and physics that they couldn’t do, have been explained in such a way that they can solve homework problems. This is great. Explaining is an important element in teaching. My own videos aim to explain in such a way that students make sense of difficult concepts, though some videos also illustrate procedure.

Teaching is much more than explaining. Teaching includes awakening a desire to learn and providing the experiences that will help a student to learn.  In these days of ever-expanding knowledge, a content-driven approach to learning and teaching will not serve our citizens well in the long run. Students need to be empowered to seek learning, to criticize, to integrate their knowledge with their life experiences. Learning should be a transformative experience. For this to take place, the teachers need to employ a variety of learner-focussed approaches, as well as explaining.

It cracks me up, the way sugary cereals are advertised as “part of a healthy breakfast”. It isn’t exactly lying, but the healthy breakfast would do pretty well without the sugar-filled cereal. Explanations really are part of a good learning experience, but need to be complemented by discussion, participation, practice and critique.  Explanations are like porridge – healthy, but not a complete breakfast on their own.

## Why statistics is so hard to teach

“I’m taking statistics in college next year, and I can’t wait!” said nobody ever!

Not many people actually want to study statistics. Fortunately many people have no choice but to study statistics, as they need it. How much nicer it would be to think that people were studying your subject because they wanted to, rather than because it is necessary for psychology/medicine/biology etc.

In New Zealand, with the changed school curriculum that gives greater focus to statistics, there is a possibility that one day students will be excited to study stats. I am impressed at the way so many teachers have embraced the changed curriculum, despite limited resources, and late changes to assessment specifications. In a few years as teachers become more familiar with and start to specialise in statistics, the change will really take hold, and the rest of the world will watch in awe.

In the meantime, though, let us look at why statistics is difficult to teach.

1. Students generally take statistics out of necessity.
2. Statistics is a mixture of quantitative and communication skills.
3. It is not clear which are right and wrong answers.
4. Statistical terminology is both vague and specific.
5. It is difficult to get good resources, using real data in meaningful contexts.
6. One of the basic procedures, hypothesis testing, is counter-intuitive.
7. Because the teaching of statistics is comparatively recent, there is little developed pedagogical content knowledge. (Though this is growing)
8. Technology is forever advancing, requiring regular updating of materials and teaching approaches.

On the other hand, statistics is also a fantastic subject to teach.

1. Statistics is immediately applicable to life.
2. It links in with interesting and diverse contexts, including subjects students themselves take.
3. Studying statistics enables class discussion and debate.
4. Statistics is necessary and does good.
5. The study of data and chance can change the way people see the world.
6. Technological advances have put the power for real statistical analysis into the hands of students.
7. Because the teaching of statistics is new, individuals can make a difference in the way statistics is viewed and taught.

I love to teach. These days many of my students are scattered over the world, watching my videos (for free) on YouTube. It warms my heart when they thank me for making something clear, that had been confusing. I realise that my efforts are small compared to what their teacher is doing, but it is great to be a part of it.

# The Knife-edge of Competence

I do my own video-editing using a very versatile and complex program called Adobe Premiere Pro. I have had no formal training, and get help by ringing my son, who taught me all I know and can usually rescue me with patient instructions over the phone. At times, especially in the early stages, I have felt myself wobbling along the knife-edge of competence. All I needed was for something new to go wrong, or to click a button inadvertently, and I would fall off the knife-edge and the whole project would disappear into a mass of binary. This was not without good reason. Premiere Pro wasn’t always stable on our computer, and at one point it took us several weeks to get our hard-drive replaced. (Apple “Time Machine” saved me from despair.) And sometimes I would forget to save regularly and a morning’s work was lost. (Even Time Machine can’t help with that level of incompetence.)

But despite my severe limitations I have managed to edit over twenty videos that now receive due attention (and at times adulation!) on YouTube. It isn’t an easy feeling, to be teetering on the brink of disaster, real or imagined. But there was no alternative, and there is a sense of pride at having made it through with only a few scars and not too much inappropriate language.

There are some things at which I feel totally competent. I can speak to a crowd of any number of people and feel happy that they will be entertained, edified and perhaps even educated. I can analyse data using basic statistical methods. I can teach a person about inference. Performing these tasks is a joy, because I know I have the prerequisite skills and knowledge to cope with whatever happens. But on the way to getting to this point, I had to walk the knife-edge of competence.

Many teachers of statistics know too well this knife-edge. In New Zealand at present there are a large number of teachers of Year 13 statistics who are teaching about bootstrapping, when their own understanding of it is sketchy. They are teaching how to write statistical reports, when they have never written one themselves. They are assessing statements about statistics that they are not actually sure about. This is a knife-edge. They feel that any minute a student will ask them a question about the content that they cannot answer. These are not beginning teachers, but teachers with years and decades of experience in teaching mathematics and mathematical statistics. But the innovations of the curriculum have put them in an uncomfortable position. Inconsistent, tardy and even incorrect information from the qualification agency is not helping, but that is a story for another day.

In another arena there are professors and lecturers of statistics (in the antipodes we do not throw around the title “professor” with the abandon of our North American cousins) who are extremely competent at statistical mathematics and analysis but who struggle to teach in a satisfactory way. Their knife-edge concerns teaching, appropriate explanation and the generation of effective learning activities and assessments in the absence of any educational training. They fear that someone will realise one day that they don’t really know how to devise learning objectives, and provide fair assessments. I am hoping that this blog is going some way to helping these people to ask for help! Unfortunately the frequent response is avoidance behaviour, which is alarmingly supported by a system that rewards research publications rather than effective educational endeavours.

So what do you do when you are walking the knife-edge of competence?

# You do the best you can.

## And sometimes you fake it.

I am led to believe there is a gender-divide on this. Some people are better at hiding their incompetence than others, and just about all the people I know like that are men. I had a classmate in my honours year who was at a similar level of competence to me, but he applied for jobs I wouldn’t have contemplated. The fear of being shown up as a fake, or not knowing EXACTLY what to do at any point stopped me from venturing. He horrified me further a few years later when he set up his own company. Nearly three decades, two children and a PhD later I am not so fastidious or “nice” in the Jane Austen meaning of the word. If I think I can probably learn how to do something in time to make a reasonable fist of it and not cause actual harm, I’m likely to have a go. Hence taking my redundancy and running!

When I first lectured in statistics for management, I did not know much beyond what I was teaching. I lived in fear that someone would ask me a question that I couldn’t answer and I would be revealed as the fake I was. Well you know, it never happened! I even taught students who were statistics majors, who did know more than I, and post-graduate students in psychology and heads of mathematics departments, and my fears were never realised. In fact the stats students told me that they finally understood the central limit theorem, thanks to my nifty little exercise using dotplots on Minitab. (Which was how I had finally understood the central limit theorem – or at least the guts of it.)

I’m guessing that this is probably true for most of the mathematics teachers who are worrying. Despite their fear, they have not been challenged or called out.

The teachers’ other unease is the feeling that they are not giving the best service to their students, and the students will suffer, miss out on scholarships, decide not to get a higher education and live their lives on the street.  I may be exaggerating a little here, but certainly few of us like to give a service that is less than what we are accustomed to. We feel bad when we do something that feels substandard.

There are two things I learned in my twenty years of lecturing that may help here:

We don’t know how students perceive what we do. Every now and again I would come out of a lecture with sweat trickling down my spine because something had gone wrong. It might be that in the middle of an explanation I had had second thoughts about it, changed tack, then realised I was right in the first-place and ended up confusing myself. Or perhaps part way through a worked example it was pointed out to me that there was a numerical error in line three. To me these were bad, bad things to happen. They undermined my sense of competence. But you know, the students seldom even noticed. What felt like the worst lecture of my life, was in fact still just fine.

The other thing I learned is that we flatter ourselves when we think how much difference our knowledge may make.  Now don’t get me wrong here – teachers make an enormous difference. People who become teachers do so because we want to help people. We want to make a difference in students’ lives. We often have a sense of calling. There may be some teachers who do it because they don’t know what else to do with their degree, but I like to think that most of us teachers teach because to not teach is unthinkable. I despise, to the point of spitting as I talk, the expression “Those who can, do, and those who can’t, teach.” One day when the mood takes me I will write a whole post about the noble art of teaching and the fallacy of that dismissive statement. My next statement is so important I will give it a paragraph of its own.

A teacher who teaches from love, who truly cares about what happens to their students, even if they are struggling on the knife-edge of competence will not ruin their students’ lives through temporary incompetence in an aspect of the curriculum.

There are many ways that a teacher can have devastating effects on their students, but being, for a short time, on the knife-edge of competence, is not one of them.

Take heart, keep calm and carry on!

# Difficult concepts in statistics

Recently someone asked: “I don’t suppose you’d like to blog a little on the pedagogical knowledge relevant to statistics teaching, would you? A ‘top five statistics student misconceptions (and what to do about them)’ would be kind of a nice thing to see …”

I wish it were that easy. Here goes:

# Things that I have found students find difficult to understand and what I have done about them.

## Observations

When I taught second year regression we would get students to collect data and fit their own multiple regressions. The interesting thing was that quite often students would collect unrelated data. The columns of the data would not be of the same observations. These students had made it all the way through first year statistics without really understanding about multivariate data.

So from then on when I taught about regression I would specifically begin by talking about observations (or data points) and explain how they were connected. It doesn’t hurt to be explicit. In the NZ curriculum materials for high school students are exercises using data cards which correspond to individuals from a database. This helps students to see that each card, which corresponds to a line of data, is one person or thing. In my video about Levels of measurement, I take the time to show this.

First suggestion is “Don’t assume”.  This applies to so much!

And this is also why it is vital that instructors do at least some of their own marking (grading). High school teachers are going, “Of course”. College professors – you know you ought to! The only way you find out what the students don’t understand, or misunderstand, or replicate by rote from your own notes, is by reading what they write. This is tedious, painful and sometimes funny in a head-banging sort of way, but necessary. I also check the prevalence of answers to multiple choice questions in my on-line materials. If there is a distracter scoring highly it is worthwhile thinking about either the question or the teaching that is leading to incorrect responses.
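Checking the prevalence of answers to a multiple choice question is easy to automate. The sketch below is a hypothetical illustration, not a description of my actual on-line system: the responses are invented, and the function name and the 25% threshold are arbitrary choices. It tallies the answers and reports any wrong option chosen by at least that share of students.

```python
from collections import Counter


def popular_distractors(responses, correct, threshold=0.25):
    """Return wrong options chosen by at least `threshold` of respondents."""
    counts = Counter(responses)
    total = len(responses)
    return {
        option: count / total
        for option, count in counts.items()
        if option != correct and count / total >= threshold
    }


# Made-up answers from twelve students to one question; the correct answer is "A".
responses = list("AABCBBDBABBC")

flagged = popular_distractors(responses, correct="A")
for option, share in flagged.items():
    print(f"Option {option} was chosen by {share:.0%} of students: check the question or the teaching.")
```

When one wrong option attracts more students than the right answer, as option B does in this invented data, that points to a shared misconception or a badly worded question rather than random guessing.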

## Inference

Well duh! Inference is a really, really difficult concept and is the key to inferential statistics. The basic idea, that we use information from a sample to draw conclusions about the population seems straight-forward. But it isn’t. Students need lots and lots of practice at identifying what is the population and what is the sample in any given situation. This needs to be done with different types of observations, such as people, commercial entities, plants or animals, geographical areas, manufactured products, instances of a physical experiment (Barbie bungee jumping), and times.

Second suggestion is “Practice”. And given the choice between one big practical project and a whole lot of small applied exercises, I would go with the exercises. A big real-life project is great for getting an idea of the big picture, and helping students to learn about the process of statistical analysis. But the problem with one big project is that it is difficult to separate the specific from the general. Context is at the core of any analysis in statistics, and makes every analysis different. Learning occurs through experiencing many different contexts and from them extracting what is general to all analysis, what is common to many analyses and what is specific to that example. The more different examples a student is exposed to, the better opportunity they have for constructing that learning. An earlier post extols the virtues of practice, even drill!

## Connections

One of the most difficult things is for students to make connections between parts of the curriculum. A traditional statistics course can seem like a toolbox of unrelated but confusingly different techniques. It takes a high level of understanding to link the probability, data and evidence aspects together in a meaningful way. It is good to have exercises that help students to make these connections. I wrote about this with regard to Operations Research and Statistics. But students need also to be making connections before they get to the end of the course.

The third suggestion is “get students to write”.

Get students to write down what is the same and what is different between chi-squared analysis and correlation. Get them to write down how a Poisson distribution is similar to and different from a binomial distribution. Get them to write down how bar charts and histograms are similar and different. The reason students must write is that it is in the writing that they become aware of what they know or don’t know. We even teach ourselves things as we write.
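For the Poisson and binomial comparison, a short calculation can give students something concrete to write about. This is a sketch with illustrative numbers (1000 trials, each with a 0.003 chance of success) chosen so that the two distributions share the same mean. Both count how many times something happens; the binomial has a fixed number of trials and so a largest possible count, while the Poisson has no upper limit. When the number of trials is large and the chance on each trial is small, the two give nearly the same probabilities.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson with mean lam."""
    return exp(-lam) * lam**k / factorial(k)


# Illustrative values: many trials, small chance of success on each.
n, p = 1000, 0.003
lam = n * p  # same mean for both distributions

print(" k   binomial    Poisson")
for k in range(8):
    print(f"{k:2d}   {binomial_pmf(k, n, p):.5f}    {poisson_pmf(k, lam):.5f}")
```

Rerunning with a small number of trials and a larger chance of success, say n = 10 and p = 0.3, shows the two columns pulling apart, which is the “different” half of the comparison.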

## Graphs and data

Another type of connection that students have trouble with is that between the data and the graph, and in particular identifying variation and distribution in a histogram or similar. There are many different graphs, that can look quite similar, and students have problems identifying what is going on. The “value graph” which is produced so easily in Excel does nothing to help with these problems. I wrote a full post on the problems of interpreting graphs.

The fourth suggestion is “think hard” (or borrow).

Teaching statistics is not for wusses. We need to think really hard about what students are finding difficult, and come up with solutions. We need to experiment with different ways of explaining and teaching. One thing that has helped my teaching is the production of my videos. I wish to use both visual and text (verbal) inputs as best as possible to make use of the medium. I have to think of ways of representing concepts visually, that will help both understanding and memory. This is NOT easy, but is extremely rewarding. And if you are not good at thinking up new ideas, borrow other people’s ideas. A good idea collector can be as good as or better than a good creator of ideas.

To think of a fifth suggestion I turned to my favourite book, “The Challenge of Developing Statistical Literacy, Reasoning and Thinking”, edited by Dani Ben-Zvi and Joan Garfield. I feel somewhat inadequate in the suggestions given above. The book abounds with studies that have shown areas of challenge for students and teachers. It is exciting that so many people are taking seriously the development of pedagogical content knowledge regarding the discipline of statistics. Some statisticians would prefer that the general population leave statistics to the experts, but they seem to be in the minority. And of course it depends on what you define “doing statistics” to mean.

But the ship of statistical protectionism has sailed, and it is up to statisticians and statistical educators to do our best to teach statistics in such a way that each student can understand and apply their knowledge confidently, correctly and appropriately.