Why people hate statistics

This summer/Christmas break it has been my pleasure to help a young woman who is struggling with statistics, and it has prompted me to ask people who teach postgraduate statistical methods – WTF are you doing?

Louise (name changed) is a bright, hard-working young woman, who has finished an undergraduate degree at a prestigious university and is now doing a Master’s degree at a different prestigious university, which is a long way from where I live and will remain nameless. I have been working through her lecture slides, past and future, and attempting to develop in her some confidence that she will survive the remainder of the course, and that statistics is in fact fathomable.

Incomprehensible courses alienating research students

After each session with Louise I have come away shaking my head and wondering what this lecturer is up to. I wonder if he/she really understands statistics or is just passing on their own confusion. And the very sad thing is that I KNOW that there are hundreds of lecturers in hundreds of similar courses around the world teaching in much the same way and alienating thousands of students every year.

And they need to stop.

Here is the approach: You have approximately eight weeks, made up of four-hour sessions, in which to teach your Master’s students everything they could possibly need to know about statistics. So you tell them everything! You use technical terms with little explanation, and you give no indication of what is important and what is background. You dive right in with no clear purpose, and you expect them to keep up.

Choosing your level

Frequently Louise would ask me to explain something and I would pause to think. I was trying to work out how deep to go. It is like when a child asks where babies come from. They may want the full details, but they may not, and you need to decide what level of answer is most appropriate. Anyone who has seen our popular YouTube videos will be aware that I encourage conceptual understanding at best, and the equivalent of a statistics driver’s licence at worst. When you have eight weeks to learn everything there is to know about statistics, up to and including multiple regression, logistic regression, GLM, factor analysis, non-parametric methods and more, I believe the most you can hope for is to be able to get the computer to run the test, and then draw intelligent conclusions from the output.

There was nothing in the course about data collection, data cleaning, the concept of inference or the relationship between the model and reality. My experience is that data cleaning is one of the most challenging parts of analysis, especially for novice researchers.

Use learning objectives

And maybe one of the worst problems with Louise’s course was that there were no specific learning objectives. One of my most popular posts is on the need for learning objectives. Now I am not proposing that we slavishly tell students in each class what it is they are to learn, as that can be tedious and remove the fun from more discovery style learning. What I am saying is that it is only fair to tell the students what they are supposed to be learning. This helps them to know what in the lecture is important, and what is background. They need to know whether they need to have a passing understanding of a test, or if they need to be able to run one, or if they need to know the underlying mathematics.

Take, for example, the t-test. There are many ways that the t-statistic can be used, so simply referring to a test as a t-test is misleading before you even start. And starting your teaching with the statistic is not helpful. We need to start with the need! I would call it a test for the difference of two means from two groups. And I would just talk about the t statistic in passing. I would give examples of output from various scenarios, some of which reject the null, some of which don’t and maybe even one that has a p-value of 0.049 so we can talk about that. In each case we would look at how the context affects the implications of the test result. In my learning objectives I would say: Students will be able to interpret the output of a test for the difference of two means, putting the result in context. And possibly, Students will be able to identify ways in which a test for the difference of two means violates the assumptions of a t-test. Now that wasn’t hard, was it?

Like driving a car

Louise likes to understand where things come from, so we did go through an overview of how various distributions have been found to model different aspects of the world well – starting with the normal distribution, and with a quick jaunt into the Central Limit Theorem. I used my Dragonistics data cards, which were invented for teaching primary school, but actually work surprisingly well at all levels! I can’t claim that Louise understands the use of the t distribution, but I hope she now believes in it. I gave her the analogy of learning to drive – that we don’t need to know what is happening under the bonnet to be a safe driver. In fact safe driving depends more on paying attention to the road conditions and human behaviour.


Louise tells me that her lecturer emphasises assumptions – that the students need to examine them all, every time they look at or perform a statistical test. Now I have no problems with this later on, but students need to have some idea of where they are going and why, before being told what luggage they can and can’t take. And my experience is that assumptions are always violated. Always. As George Box put it – “All models are wrong, but some are useful.”

It did not help that the lecturer seemed a little confused about the assumption of normality. I am not one to point the finger, as this is a tricky assumption, as the Andy Field textbook pointed out. For example, we do not require the independent variables in a multiple regression to be normally distributed as the lecturer specified. This is not even possible if we are including dummy variables. What we do need to watch out for is that the residuals are approximately modelled by a normal distribution, and if not, that we do something about it.
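To make the point about residuals concrete, here is a minimal sketch in plain Python. The data are made up purely for illustration, and `fit_line` is a hypothetical helper, not something from Louise's course. The predictor is a 0/1 dummy variable, which cannot possibly be normally distributed, yet the regression is perfectly sensible. What we would go on to inspect is the list of residuals.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Made-up illustrative data: x is a dummy variable (0 = group A, 1 = group B).
x = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y = [4.1, 5.0, 4.6, 5.3, 4.5, 6.9, 7.4, 6.6, 7.8, 7.3]

intercept, slope = fit_line(x, y)

# The residuals are what the normality assumption is about, not x itself.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
print(intercept, slope)
print(residuals)
```

With a single dummy predictor the fitted values are simply the two group means, so the residuals are each observation's distance from its own group mean. A histogram or normal probability plot of `residuals` would be the natural next step.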

You may have gathered that my approach to statistics is practical rather than idealistic. Why get all hot and bothered about whether you should do a parametric or non-parametric test, when the computer package does both with ease, and you just need to check if there is any difference in the result? (I can hear some purists hyperventilating at this point!) My experience is that the results seldom differ.
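As a rough sketch of what "run both and compare" might look like, the Python below computes a p-value for a difference of two means and a p-value for a Mann-Whitney rank test on the same data. Several things here are my assumptions rather than anything from the course: the data are made up, the function names are hypothetical, and both p-values use a normal approximation so that only the standard library is needed. A real statistics package would use the t distribution and exact or tie-corrected rank methods.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def mean_difference_p(a, b):
    """Two-sided p-value for a difference of means.
    Uses the Welch standard error with a normal approximation (an assumption)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def mann_whitney_p(a, b):
    """Two-sided p-value for the Mann-Whitney U test.
    Normal approximation, no tie correction (an assumption)."""
    n1, n2 = len(a), len(b)
    # U counts the pairs where the value from a beats the value from b.
    u = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Made-up illustrative data for two groups.
group_a = [12.1, 14.3, 13.8, 15.2, 12.9, 14.7, 13.3, 15.0]
group_b = [10.2, 11.5, 9.8, 12.0, 10.9, 11.1, 10.4, 11.8]

p_parametric = mean_difference_p(group_a, group_b)
p_rank = mann_whitney_p(group_a, group_b)
print(p_parametric, p_rank)
```

If the two p-values lead to the same conclusion, as they do for these invented numbers, the choice of test matters little. If they disagree, that is the moment to look harder at the data.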

What post-graduate statistical methods courses should focus on

Instructors need to concentrate on the big ideas of statistics – what is inference, why we need data, how a sample is collected matters, and the relationship between a model and the reality it is modelling. I would include the concept of correlation, and its problematic link to causation. I would talk about the difference between statistical significance and usefulness, and evidence and strength of a relationship. And I would teach students how to find the right fishing lessons! If a student is critiquing a paper that uses logistic regression, that is the time they need to read up enough about logistic regression to be able to understand what they are reading. They cannot possibly learn a useful amount about all the tests or methods that they may encounter one day.

If research students are going to be doing their own research, they need more than a one semester fly-by of techniques, and would be best to get advice from a statistician BEFORE they collect the data.

Final word

So here is my take-home message:

Stop making graduate statistical methods courses so outrageously difficult by cramming them full of advanced techniques and concepts. Instead help students to understand what statistics is about, and how powerful and wonderful it can be to find out more about the world through data.

Your word

Am I right or is my preaching of the devil? Please add your comments below.

Talking in class: improving discussion in maths and stats classes

Maths is right or wrong – end of discussion – or is it?

In 1984 I was a tutor in Operations Research to second year university students. My own experience of being in tutorials at University had been less than inspiring, with tutors who seemed reserved and keen to give us the answers without too much talking. I wanted to do a good job. My induction included a training session for teaching assistants from throughout the university. Margaret was a very experienced educational developer and was very keen for us to get the students discussing. I tried to explain to her that there really wasn’t a lot to discuss in my subject. You either knew how to solve a set of linear equations using Gauss-Jordan elimination or you didn’t. The answer was either correct or incorrect.

I suspect many people have this view of mathematics and its close relations, statistics and operations research. Our classes have traditionally followed a set pattern. The teacher shows the class how to do something. The class copies down notes and some examples into their books, and then they individually work through exercises in the textbook – generally in silence. The teacher walks around the room and helps students as needed.

Prizes can help motivate students to give answers in unfamiliar settings

So when we talk about discussion in maths classes, this is not something that mathematics and statistics teachers are all familiar with. I recently gave a workshop for about 100 Scholarship students in Statistics in the Waikato. What a wonderful time we had together! The students were from all different schools and needed to be warmed up a little with prizes, but we had some good discussion in groups and as a whole. One of the teachers commented later on the level of discussion in the session. Though she was an experienced maths teacher she found it difficult to lead discussion in the class. I am sure there are many like her.

It is important to talk in maths and stats classes

It is difficult for many students to learn in solitary silence. As we talk about a topic we develop our understanding, practise the language of the discipline and experience what it means to be a mathematician or statistician. Explaining ideas to others helps us to make sense of them ourselves. As we listen to other people’s thinking we can see how it relates to what we think, and can clear up misconceptions. Some people just like to talk (who me?) and find learning more fun in a cooperative or collaborative environment. This recognition of the need for language and interaction underpins the development of “rich tasks” that are being used in mathematics classrooms throughout the world.

I have previously stated that “Maths learning should be communal and loud and exciting, not solitary, quiet and routine.”

Classroom atmosphere

One thing that was difficult at the Scholarship day was that the students did not know each other, and came from various schools. In a regular classroom the teacher has the opportunity of and responsibility for setting the tone of the class. Students need to feel safe. They need to feel that giving a wrong answer is not going to lead to ridicule. Several sessions at the start of the year may be needed to encourage discussion. Ideally this will become less necessary over time as students become used to interactive, inquiry-based learning in mathematics and statistics through their whole school careers.

“Number talks” are a tool to help students improve their understanding of number, and recognise that there are many ways to see things. For example, the class might be shown a picture of dots and asked to explain how many dots they see, and how they worked it out. Several different ways of thinking will be discussed.

Children are encouraged to think up multiple ways of thinking about numbers and to develop discussion by following prompts, sometimes called “talk moves”. Talk moves include revoicing, where the teacher restates what she thinks the student has said; asking students to restate another student’s reasoning; asking students to apply their own reasoning to someone else’s reasoning (Do you agree or disagree and why?); prompting for further participation (Would someone like to add on?); and using wait time (teachers should allow students to think for at least 10 seconds before calling on someone to answer). These are explained more fully in The Tools of Classroom talk.

Google Image is awash with classroom posters outlining “Talk moves”. I have been unable to trace back the source of the term or the list, and would be very pleased if someone can tell me the source, so that I can attribute this structure.

Good questions

The essence of good discussion is good questions. Question ping pong is not classroom discussion. We have all experienced a teacher working through examples on the board, while asking students the answers to numerical questions. This is a control technique for keeping students attentive, but it can fall to a small group of students who are quick to answer. I remember doing just this in my tutorial on solving matrices, when I didn’t know any better.

Teachers should avoid asking questions that they already know the answers to.

It is not a hard-and-fast rule, but definitely a thing to think about. I like to use True/False quizzes to help uncover misconceptions, and develop use of statistical language. I just about always know the answer to the question, but what I don’t know is how many students know the answer. So I ask the question not to know the answer, but to know if the students do, and to provoke discussion. Perhaps a more interesting question would be, how many students do you think will say “True” to this statement? It would then be interesting to find out their reasoning, so long as it does not get personal!

Multiple answers and open-ended questions

Where possible we need to ask questions that can have a number of acceptable answers. A discussion about what to do with outliers will seldom have a definitive answer, unless the answer is that it depends! Asking students to make a pictorial representation of an algebra problem can lead to interesting discussions.

The MathTwitterBlogosphere has many attractive ideas to use in teaching maths.

I rather like “Which one doesn’t belong”, which has echoes of “One of these things is not like the other, one of these things doesn’t belong, can you guess…” from Sesame Street. However, in Sesame Street the answer was usually unambiguous, whereas with WODB there are lots of ways to have alternative answers. There is a website dedicated to sets of four objects, and the discussion is about which one does not belong. In each case all four can “not belong” for some reason, which I find a bit contrived, but it can lead to discussion about which is the strongest case of not belonging.

Whole class and group discussion

Some discussions work well for a whole class, while others are better in small groups or pairs. Matching or ordering paper slips with expressions can lead to great discussion. For example we could have a set of graphs of the same data, and order them according to how effective they are at communicating the aspects of the data. Or there could be statements of possible events and students can place them in order of likelihood. The discussion involved in ordering them helps students to clarify the nature of probability. Desmos has a facility for teachers to set up card matching or grouping exercises, which reduces the work and waste of paper.

Our own Dragonistics data cards are great for discussion. Students can be given a number of dragons (more than two) and decide which one is the best, or which one doesn’t belong, or how to divide the dragons fairly into two or more groups.

It can seem to be wasting time to have discussion. However the evidence from research is that good discussion is an effective way for students to learn mathematics and statistics. I challenge all maths and stats teachers to increase and improve the discussion in their class.

Play and learning mathematics and statistics

The role of play in learning

I have been reading further about teaching mathematics and came across this interesting assertion:

Play, understood as something frivolous, opposed to work, off-task behaviour, is not welcomed into most mathematics classrooms. But play is exactly what is needed. It is only play that can entice us to the type of repetition that is needed to learn how to inhabit the mathematical landscape and how to create new mathematics.
Friesen (2000) – unpublished thesis, cited in Stordy, Children Count (2015)

Play and practice

It is an appealing idea that as children play, they have opportunities to engage in repetition that is needed in mastering some mathematical skills. The other morning I decided to do some exploration of prime numbers and factorising even before I got out of bed. (Don’t judge me!) It was fun, and I discovered some interesting properties, and came up with a way of labelling numbers as having two, three and more dimensions. 12 is a three dimensional number, as is 20, whereas 35 and 77 are good examples of two dimensional numbers. As I was thus playing on my own, I was aware that I was practising my tables and honing my ability to think multiplicatively. In this instance the statement from Friesen made sense. I admit I’m not sure what it means to “create new mathematics”. Perhaps that is what I was doing with my 2 and 3 dimensional numbers.
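For anyone who wants to play along, here is a small Python sketch. It rests on my reading of the examples above, which is an assumption: the "dimension" of a number is taken to be how many prime factors it has, counting repeats, so 12 = 2 × 2 × 3 has three. The function names `prime_factors` and `dimension` are made up for this sketch.

```python
def prime_factors(n):
    """Return the prime factors of n (n >= 2), with repeats, smallest first."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        # Whatever is left over is itself prime.
        factors.append(n)
    return factors

def dimension(n):
    """Assumed definition: the number of prime factors, counting repeats."""
    return len(prime_factors(n))

for n in (12, 20, 35, 77):
    print(n, prime_factors(n), dimension(n))
```

Under this definition a prime is one dimensional (a line of dots), 35 is a 5 by 7 rectangle, and 12 is a 2 by 2 by 3 block.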

You may be wondering what this has to do with teaching statistics to adults. Bear with…

Traditional vs recent teaching methods for mathematics

Today on Twitter, someone asked what to do when a student says that they like being shown what to do, and then practising on textbook examples. This is the traditional method for teaching mathematics, and is currently not seen as ideal among many maths teachers (particularly those who inhabit the MathTwitterBlogosphere or MTBoS, as it is called). There is strong support for a more investigative, socially constructed approach to learning and teaching mathematics.  I realise that as a learner, I was happy enough learning maths by being shown what to do and then practising. I suspect a large proportion of maths teachers also liked doing that. Khan Academy videos are wildly popular with many learners and far too many teachers because they perpetuate this procedural view of mathematics. So is the procedural approach wrong? I think what it comes down to is what we are trying to teach. Were I to teach mathematics again I would not use “show then practise” as my modus operandi. I would like to teach children to become mathematicians rather than mathematical technicians. For this reason, the philosophies and methods of Youcubed, Dan Meyer and other MTBoS bloggers have appeal.

Play and statistics

Now I want to turn my thoughts to statistics. Is there a need for more play in statistics? Can statistics be playful in the way that mathematics can be playful? Operations Research is just one game after another! Simulation, critical path, network analysis, travelling salesperson, knapsack problem. They are all big games. Probability is immensely playful, but what about statistical analysis? Can and should statistics be playful?

My first response is that there is no play in statistics. Statistics is serious and important, and deals with reality, not joyous abstract ideas like prime numbers and the Fibonacci series – and two and three dimensional numbers.

The excitement of a fresh set of data

But there is that frisson of excitement as you finally finish cleaning your database and a freshly minted set of variables and observations beckons to you, with SPSS, SAS or even Excel at your fingertips. A new set of data is a new journey of discovery. Of course a serious researcher has already worked out a methodical route through her hypotheses… maybe. Or do we mostly all fossick about looking for patterns and insights, growing more and more familiar with the feel of the data, as if we were squeezing it through our fingers? So yes – my experience of data exploration is playful. It is an adventure, with wrong turns, forgetting the path, starting again, finding something only to lose it again and finally saying “enough” and taking a break, not because the data has been exhausted, but because I am.

Writing the report is like cleaning up

Writing up statistical analysis is less exciting. It feels like picking up the gardening tools and putting them away after weeding the garden. Or cleaning the paintbrushes after creating a masterpiece. That was not one of my strengths – finishing and tidying up afterwards. The problem was that I felt I had finished when the original task had been completed – when the weeds had been pulled or the painting completed. In my view, cleaning and putting away the tools was an afterthought that dragged on after the completion of the task, and too often got ignored. Happily I have managed to change my behaviour by rethinking the nature of the weeding task. The weeding task is complete when the weeds are pulled and in the compost and the implements are resting clean and safe where they belong. Similarly a statistical analysis is not what comes before the report-writing, but is rather the whole process, ending when the report is complete, and the data is carefully stored away for another day. I wonder if that is the message we give our students – a thought for another post.

Can statistics be playful?

For I have not yet answered the question. Can statistics be playful in the way that mathematics can be playful? We want to embed play in order to make the task of repetition more enjoyable, and learning statistics requires repetition, in order to develop skills and learn to differentiate the universal from the individual. One problem is that statistics can seem so serious. When we use databases about global warming, species extinction, cancer screening, crime detection, income discrepancies and similarly adult topics, it can seem almost blasphemous to be too playful about it.

I suspect that one reason our statistics videos on YouTube are so popular is because they are playful.


Helen has an attitude problem

Helen has a real attitude problem and hurls snarky comments at her brother, Luke. The apples fall in an odd way, and Dr Nic pops up in strange places. This playfulness keeps the audience engaged in a way that serious, grown up themes may not. This is why we invented Ear Pox in our video about Risk and screening, because being playful about cancer is inappropriate.

Ear Pox is an imaginary disease for which we are studying the screening risk.

A set of 240 Dragonistics data cards provides light-hearted data which yields satisfying results.

When I began this post I did not intend to bring it around to the videos and the Dragonistics data cards, but I have ended up there anyway. Maybe that is the appeal of the Dragonistics data cards – that they avoid the gravitas of true and real grown-up data, and maintain a playfulness that is more engaging than reality. There is a truthiness about them: the two species, green and red dragons, are different enough to present as different animal species, and the rules of danger and breath-type make sense. But students may happily play with the dragon cards without fear of ignorance or even irreverence towards a real-life context.

What started me thinking about play with regards to learning maths and statistics is our Cat Maths cards. There are just so many ways to play with them that I can see Cat Maths cards playing an integral part in a junior primary classroom. This is why we created them and want them to make their way into classrooms. You can help by supporting our Kickstarter crowdfunding campaign. Click the picture to pledge and get a box, provide a box for a school, or make a corporate donation.

We'd love your help.

Your thoughts about play and statistics

And maybe we need to be thinking a little more about the role of play in learning statistics – even for adults! What do you think? Can and should statistics be playful? And for what age group? Do you find statistical analysis fun?


The nature of mathematics and statistics and what it means to learn and teach them

I’ve been thinking lately….

Sometimes it pays to stop and think. I have been reading a recent textbook for mathematics teachers, Dianne Siemon et al, Teaching mathematics: foundations to middle years (2011). On page 47 the authors asked me to “Take a few minutes to write down your own views about the nature of mathematics, mathematics learning and mathematics teaching.” And bearing in mind I see statistics as related to, but not enclosed by mathematics, I decided to do the same for statistics as well. So here are my thoughts:

The nature of mathematics

Mathematicians love the elegance of mathematics

Mathematics is a way of modelling and making sense of the world. Mathematics underpins scientific and commercial endeavours as well as everyday life. Mathematics is about patterns and proofs and problem structuring and solution finding. I used to think it was all about the answer, but now I think it is more about the process. I used to think that maths was predominantly an individual endeavour, but now I can see how there is a social or community aspect as well. I fear that too often students are getting a parsimonious view of mathematics, thinking it is only about numbers, and something they have to do on their own. I find my understanding of the nature of mathematics is rapidly changing as I participate in mathematics education at different ages and stages. I have also been influenced by the work of Jo Boaler.

To learn mathematics

My original idea of mathematics learning comes from my own successful experience of copying down notes from the board, listening to the teacher and doing the exercises in the textbook. I was not particularly fluent with my times-tables, but loved problem-solving. If I got something wrong, I was happy to try again until I nutted it out. Sometimes I even did recreational maths, like the time I enumerated all possible dice combinations in Risk to find out who had the advantage – attacker or defender. I always knew that it took practice to be good at mathematics. However I never really thought of mathematics as a social endeavour. I feel I missed out, now. From time to time I do have mathematical discussions with my colleague. It was an adventure inventing Rogo and then working out a solution method. Mathematics can be a social activity.
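The Risk enumeration is small enough to redo on a computer. The Python sketch below is not my original pencil-and-paper working; it assumes the standard rules as I understand them: dice are sorted, highest is compared with highest and next-highest with next-highest, and the defender wins ties. The function name `risk_outcomes` is made up for this sketch.

```python
from collections import Counter
from itertools import product

def risk_outcomes(n_attack=3, n_defend=2):
    """Count every dice combination by (attacker losses, defender losses)."""
    counts = Counter()
    comparisons = min(n_attack, n_defend)
    for roll in product(range(1, 7), repeat=n_attack + n_defend):
        attack = sorted(roll[:n_attack], reverse=True)
        defend = sorted(roll[n_attack:], reverse=True)
        # zip pairs highest with highest; the defender wins ties (a <= d).
        attacker_losses = sum(a <= d for a, d in zip(attack, defend))
        counts[(attacker_losses, comparisons - attacker_losses)] += 1
    return counts

counts = risk_outcomes()
total = sum(counts.values())
for (a_lost, d_lost), n in sorted(counts.items()):
    print(f"attacker loses {a_lost}, defender loses {d_lost}: {n} of {total}")
```

Calling `risk_outcomes(1, 1)` or `risk_outcomes(2, 2)` gives the counts for smaller battles, which is a nice way to see how the balance shifts as the number of dice changes.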

To teach mathematics

When I became a maths teacher I perpetuated the method that had worked for me, as I had not been challenged to think differently. I did like the ideas of mastery learning and personalised system of instruction. This meant that learners progressed to the next step only when they had mastered the previous one. I was a successful enough teacher and enjoyed my work.

Then as a university lecturer I had to work differently, and experimented. I had a popular personalised system of instruction quantitative methods course, relying totally on students working individually, at their own pace. I am happy that many of my students were successful in an area they had previously thought out of their reach. For some students it was the only subject they passed.

What I would do now

If I were to teach mathematics at school level again, I hope I would do things differently. I love the idea of “Number talks” and rich tasks which get students to think about different ways of doing things. I had often felt sad that there did not seem to be much opportunity to have discussions in maths, as things were either right or wrong. Now I see what fun we could have with open-ended tasks. Maths learning should be communal and loud and exciting, not solitary, quiet and routine. I have been largely constructivist in my teaching philosophy, but now I would like to try out social constructivist thinking.


And what about statistics? At school in the 1970s I never learned more than the summary statistics and basic probability. At uni level it was bewildering, but I managed to get an A grade in a first year paper without understanding any of the basic principles. It wasn’t until I was doing my honours year in Operations Research and was working as a tutor in Statistical methods that things started to come together – but even then I was not at home with statistical ideas and was happy to leave them behind when I graduated.

The nature of statistics

Statistics lives in the real world

My views now on the nature of statistics are quite different. I believe statistical thinking is related to mathematical thinking, but with less certainty and more mess. Statistics is about models of reality, based on imperfect and incomplete data. Much of statistics is a “best guess” backed up by probability theory. And statistics is SO important to empowered citizenship. There are wonderful opportunities for discussion in statistics classes. I had a fun experience recently with a bunch of Year 13 Scholarship students in the Waikato. We had collected data from the students, having asked them to interpret a bar chart and a pie chart. There were some outliers in the data and I got them to suggest what we should do about them. There were several good suggestions and I let them discuss for a while then moved on. One asked me what the answer was and I said I really couldn’t say – any one of their suggestions was valid. It was a good teaching and learning moment. Statistics is full of multiple good answers, and often no single, clearly correct, answer.

Learning statistics

My popular Quantitative Methods for Business course was developed on the premise that learning statistics requires repeated exposure to similar analyses of multiple contexts. In the final module, students did many, many hypothesis tests, in the hope that it would gradually fall into place. That is what worked for me, and it did seem to work for many of the students. I think that is not a particularly bad way to learn statistics. But there are possibly better ways.

I do like experiential learning, and statistics is perfect for real life experiences. Perhaps the ideal way to learn statistics is by performing an investigation from start to finish, guided by a knowledgeable tutor. I say perhaps, because I have reservations about whether that is effective use of time. I wrote a blog post previously, suggesting that students need exposure to multiple examples in order to know what in the study is universal and what applies only to that particular context. So perhaps that is why students at school should be doing an investigation each year within a different context.

The nature of understanding

This does beg the question of what it means to learn or to understand anything. I hesitate to claim full understanding. Of anything. Understanding is progressive and multi-faceted and functional. As we use a technique we understand it more, such as hypothesis testing or linear programming. Understanding is progressive. My favourite quote about understanding is from Moore and Cobb, that “Mathematical understanding is not the only understanding.” I do not understand the normal distribution because I can read the Gaussian formula. I understand it from using it, and in a different way from a person who can derive it. In this way my understanding is functional. I have no need to be able to derive the Gaussian function for what I do, and the nature and level of my understanding of the normal distribution, or multiple regression, or bootstrapping is sufficient for me, for now.

Teaching statistics

I believe our StatsLC videos do help students to understand and learn statistics. I have put a lot of work into those explanations, and have received overwhelmingly positive feedback about the videos. However, that is no guarantee, as Khan Academy videos get almost sycophantic praise and I know that there are plenty of examples of poor pedagogy and even error in them. I have recently been reading from “Make it Stick”, which summarises theory based on experimental research on how people learn for recall and retention. I was delighted to find that the method we had happened upon in our little online quizzes was promoted as an effective method of reinforcing learning.

Your thoughts

This has been an enlightening exercise, and I recommend it to anyone teaching in mathematics or statistics. Read the first few chapters of a contemporary text on how to teach mathematics. Dianne Siemon et al, Teaching mathematics: foundations to middle years (2011) did it for me. Then “take a few minutes to write down your own views about the nature of mathematics, mathematics learning and mathematics teaching.” To which I add my own suggestion to think about the nature of statistics or operations research. Who knows what you will find out. Maybe you could put a few of your ideas down in the comments.


Mathematics teaching Rockstar – Jo Boaler

Moving around the education sector

My life in education has included being a High School maths teacher, then teaching at university for 20 years. I then made resources and gave professional development workshops for secondary school teachers. It was exciting to see the new statistics curriculum being implemented into the New Zealand schools. And now we are making resources and participating in the primary school sector. It is wonderful to learn from each level of teaching. We would all benefit from more discussion across the levels.

Educational theory and idea-promoters

My father used to say (and the sexism has not escaped me) “Never run after a woman, a bus or an educational theory, as there will be another one along soon.” Education theories have lifespans, and some theories are more useful than others. I am not a fan of “learning styles” and fear they have served many students ill. However, there are some current ideas and idea-promoters in the teaching of mathematics that I find very attractive. I will begin with Jo Boaler, and intend to introduce you over the next few weeks to Dan Meyer, Carol Dweck and the people who wrote “Make it Stick.”

Jo Boaler

My first contact with Jo Boaler was reading “The Elephant in the Classroom.” In this Jo points out how society is complicit in the idea of a “maths brain”. Somehow it is socially acceptable to admit or be almost defensively proud of being “no good at maths”. A major problem with this is that her research suggests that later success in life is connected to attainment in mathematics. In order to address this, Jo explores a less procedural approach to teaching mathematics, including greater communication and collaboration.

Mathematical Mindsets

It is interesting to see the effect Jo Boaler’s recent book, “Mathematical Mindsets”, is having on colleagues in the teaching profession. The maths advisors based in Canterbury NZ are strong proponents of her idea of “rich tasks”. Here are some tweets about the book:

“I am loving Mathematical Mindsets by @joboaler – seriously – everyone needs to read this”

“Even if you don’t teach maths this book will change how you teach for ever.”

“Hands down the most important thing I have ever read in my life”

What I get from Jo Boaler’s work is that we need to rethink how we teach mathematics. The methods that worked for mathematics teachers are not the methods we need to be using for everyone. The defence “The old ways worked for me” is not defensible in terms of inclusion and equity. I will not even try to boil down her approach in this post, but rather suggest readers visit her website and read the book!

At Statistics Learning Centre we are committed to producing materials that fit with sound pedagogical methods. Our Dragonistics data cards are perfect for use in a number of rich tasks. We are constantly thinking of ways to embed mathematics and statistics tasks into the curriculum of other subjects.

Challenges of implementation

I am aware that many of you readers are not primary or secondary teachers. There are so many barriers to getting mathematics taught in a more exciting, integrated and effective way. Primary teachers are not mathematics specialists, and may well feel less confident in their maths ability. Secondary mathematics teachers may feel constrained by the curriculum and the constant assessment in the last three years of schooling in New Zealand. And tertiary teachers have little incentive to improve their teaching, as it takes time from the more valued work of research.

Though it would be exciting if Jo Boaler’s ideas and methods were espoused in their entirety at all levels of mathematics teaching, I am aware that this is unlikely – as in a probability of zero. However, I believe that teachers at all levels can improve, even a little at a time. We at Statistics Learning Centre are committed to this vision. Through our blog, our resources, our games, our videos, our lessons and our professional development we aim to empower all teachers to teach statistics – better! We espouse the theories and teachings explained in Mathematical Mindsets, and hope that you also will learn about them, and endeavour to put them into place, whatever level you teach at.

Do tell us if Jo Boaler’s work has had an impact on what you do. How can the ideas apply at all levels of teaching? Do teachers need to have a growth mindset about their own ability to improve their teaching?

Here are some quotes to leave you with:

Mathematical Mindsets Quotes

“Many parents have asked me: What is the point of my child explaining their work if they can get the answer right? My answer is always the same: Explaining your work is what, in mathematics, we call reasoning, and reasoning is central to the discipline of mathematics.”
“Numerous research studies (Silver, 1994) have shown that when students are given opportunities to pose mathematics problems, to consider a situation and think of a mathematics question to ask of it—which is the essence of real mathematics—they become more deeply engaged and perform at higher levels.”
“The researchers found that when students were given problems to solve, and they did not know methods to solve them, but they were given opportunity to explore the problems, they became curious, and their brains were primed to learn new methods, so that when teachers taught the methods, students paid greater attention to them and were more motivated to learn them. The researchers published their results with the title “A Time for Telling,” and they argued that the question is not “Should we tell or explain methods?” but “When is the best time to do this?””
“five suggestions that can work to open mathematics tasks and increase their potential for learning: Open up the task so that there are multiple methods, pathways, and representations. Include inquiry opportunities. Ask the problem before teaching the method. Add a visual component and ask students how they see the mathematics. Extend the task to make it lower floor and higher ceiling. Ask students to convince and reason; be skeptical.”

All quotes from

Jo Boaler, Mathematical Mindsets: Unleashing Students’ Potential through Creative Math, Inspiring Messages and Innovative Teaching

Teaching sampling with dragon data cards

Data cards for teaching statistics

Data cards are a wonderful way for students to get a feel for data. As a University lecturer in the 1990s, I found that students often didn’t understand about the multivariate nature of data. This may well be an artifact of the kind of statistics they studied at school, which was univariate (finding the confidence interval for the mean of a set of numbers) or bivariate at best. And back then, when statistical analysis was done by hand calculation, this was all you could expect. How times have changed!

At the NZAMT (NZ Association of Mathematics Teachers) conference in 2015, both Dick de Veaux and Rob Gould suggested in their keynote addresses that students need to be exposed to multivariate data. Rob endorsed the use of data cards to enable this. Data cards are a wonderful tool for all levels of learning. In the New Zealand “Figure it out” series, there are several lessons that use data cards, generally made by the students themselves. We were inspired by this and have developed a set of 240 data cards with information about dragons, to help teachers and students learn and be successful in their statistical endeavours. In an earlier post I discuss the pros and cons of fictional data.

The Dragonistics data cards are now available to purchase, and we have a range of supporting materials for lessons and activities at various levels. You can find out more about data cards by clicking on this link.

Teaching Sampling using Dragonistics data cards

A small sample of Dragonistics data cards

The real advantage of using data cards to teach sampling is that it is difficult, approaching prohibitive, to record and analyse all the information. When you have a spreadsheet of data on a computer, taking a sample is contrived and can confuse students. They wonder why you would not simply analyse all the data for the population. Physically collecting data can take more time than is practical. With the data cards, we know we cannot easily process the data from all 240 or 480 dragons (depending on how many boxes you use). Sampling then becomes a sensible solution. Different groups of students take different samples and perform their own analysis, leading to similar but not identical results. This shows the concept of variation due to sampling in a concrete and memorable way.
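For readers who want to see this variation without a box of cards, here is a minimal Python sketch. The 240 strength values are randomly generated stand-ins, not the real Dragonistics data, and the number of groups and the sample size of 24 are assumptions chosen only for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 240 made-up dragon strengths (not the real card data)
population = [round(random.gauss(50, 12)) for _ in range(240)]

def group_sample_medians(pop, n_groups=6, sample_size=24):
    """Each group draws its own random sample of cards and reports the median."""
    return [statistics.median(random.sample(pop, sample_size))
            for _ in range(n_groups)]

medians = group_sample_medians(population)
print("Population median:", statistics.median(population))
print("Group medians:    ", medians)
```

Each group's median should land near the population median, but the groups will rarely agree exactly, which is the point of the classroom activity.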

The old technology – two variables, and no embedded context

Some decades ago I developed a set of counters of four different colours, with data with different means and standard deviations. I used these to teach about the concept of sampling, and the students did ANOVA analysis on them to see if the means of the four groups were the same. This was a good way to teach this principle. However there were two limitations. The first limitation is that the data is not multivariate. There are just two variables, colour and the number. And the second limitation is that there was no context. I made up a context to go with it, something around sales I think, as this was for an MBA class, which partly overcame that problem.
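As a rough illustration of the analysis the students carried out on the counters, the sketch below computes a one-way ANOVA F statistic by hand in Python. The counter values are invented for this example; the original counters are not reproduced here.

```python
import statistics

# Made-up counter values for four colours (illustrative only)
groups = {
    "red":    [12, 15, 14, 10, 13],
    "blue":   [18, 17, 20, 16, 19],
    "green":  [13, 12, 15, 14, 11],
    "yellow": [22, 20, 21, 23, 19],
}

def one_way_anova_f(samples):
    """Return the F statistic: between-group mean square over within-group mean square."""
    all_values = [x for s in samples for x in s]
    grand_mean = statistics.fmean(all_values)
    k, n = len(samples), len(all_values)
    ss_between = sum(len(s) * (statistics.fmean(s) - grand_mean) ** 2 for s in samples)
    ss_within = sum((x - statistics.fmean(s)) ** 2 for s in samples for x in s)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

f_stat = one_way_anova_f(list(groups.values()))
df_between = len(groups) - 1
df_within = sum(len(s) for s in groups.values()) - len(groups)
print(f"F = {f_stat:.2f} on {df_between} and {df_within} degrees of freedom")
```

A large F value suggests that the group means differ by more than the within-group scatter would explain. In a class, a statistical package would normally report the accompanying p-value.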


I’d like to think that I have learned from all the reading, research, experience, seminars etc on how to teach statistics that I have participated in. Consequently, were I to teach an MBA Quantitative Methods class again, I think I would use the Dragon data cards. We have recently produced a lesson plan that teaches about the concept of sampling and variation due to sampling. Dragon data cards could also be used for teaching about the mechanics of sampling, such as stratification and systematic sampling. There needs to be a story behind the analysis or there is no point to the conclusion. In the lesson previously alluded to, the scenario is that we are building separate shelters for male and female dragons, and it would be useful to have an idea of the relative strengths of male and female dragons.
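The two sampling mechanics mentioned above can be sketched in a few lines of Python. The deck here is a hypothetical list of 240 (card id, sex) pairs generated at random, not the real Dragonistics cards, and the function names are my own.

```python
import random

random.seed(2)  # fixed seed so the sketch is repeatable

# Hypothetical deck of 240 cards: (card id, sex). Not the real Dragonistics data.
deck = [(i, random.choice(["female", "male"])) for i in range(240)]

def systematic_sample(cards, n):
    """Take every k-th card after a random start, where k = len(cards) // n."""
    step = len(cards) // n
    start = random.randrange(step)
    return cards[start::step][:n]

def stratified_sample(cards, n_per_stratum):
    """Draw the same number of cards at random from each sex."""
    sample = []
    for sex in ("female", "male"):
        stratum = [c for c in cards if c[1] == sex]
        sample.extend(random.sample(stratum, n_per_stratum))
    return sample

print(systematic_sample(deck, 12))
print(stratified_sample(deck, 6))
```

With physical cards, the systematic sample corresponds to dealing out every twentieth card from a shuffled deck, and the stratified sample to sorting the deck by sex first and drawing from each pile.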

Evidence and Distribution

Using data cards gives a wonderful opportunity to explore the concepts of evidence and of distribution. The students lay out their cards in a nice bar chart arrangement, and say, “See – there is a difference.” Teachers should then ask for evidence. Students need to be able to articulate what evidence there is for the effect they have observed, and place it in context. We have found this to be a useful process when teaching students of all levels.

With regard to distribution, if we work only with numbers, and find the medians of the two groups and observe that the median is higher for one group than the other, this is rather limited information. By observing the distribution of the dragon cards between the two sexes, we can see that there is overlap. It is not a clear-cut difference. Additionally we may observe other effects, such as one due to colour, which we might like to explore further in another journey around the Statistical Enquiry Cycle.

Data cards are a win

It is fascinating that the concept of data cards is so new. It seems like an obvious idea, and makes concrete some very tricky abstract ideas. Data cards are useful at almost any level of understanding. As the need for understanding of statistics grows, there has been an emphasis on finding out better ways to teach for understanding. Clearly data cards are a win!



Data for teaching – real, fake, fictional

There is a push for teachers and students to use real data in learning statistics. In this post I am going to address the benefits and drawbacks of different sources of real data, and make a case for the use of good fictional data as part of a statistical programme.

Here is a video introducing our fictional data set of 180 or 240 dragons, so you know what I am referring to.

Real collected, real database, trivial, fictional

There are two main types of real data. There is the real data that students themselves collect and there is real data in a dataset, collected by someone else, and available in its entirety. There are also two main types of unreal data. The first is trivial and lacking in context and useful only for teaching mathematical manipulation. The second is what I call fictional data, which is usually based on real-life data, but with some extra advantages, so long as it is skilfully generated. Poorly generated fictional data, as often found in case studies, is very bad for teaching.


When deciding what data to use for teaching statistics, it matters what it is that you are trying to teach. If you are simply teaching how to add up 8 numbers and divide the result by 8, then you are not actually doing statistics, and trivial fake data will suffice. Statistics only exists when there is a context. If you want to teach about the statistical enquiry process, then having the students genuinely involved at each stage of the process is a good idea. If you are particularly wanting to teach about fitting a regression line, you generally want to have multiple examples for students to use. And it would be helpful for there to be at least one linear relationship.
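For the regression case, a least-squares line can be fitted with nothing more than the Python standard library. The eight age and height pairs below are made up for illustration and do not come from any of our datasets.

```python
import statistics

# Made-up paired measurements (illustrative only): age in years, height in cm
ages = [2, 4, 5, 7, 8, 10, 11, 13]
heights = [95, 110, 112, 130, 133, 150, 152, 170]

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line minimising squared vertical errors."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = least_squares_line(ages, heights)
print(f"height = {intercept:.1f} + {slope:.2f} * age")
```

The arithmetic is the easy part. The statistics lies in deciding whether a straight line is a sensible description of the relationship in its context.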

I read a very interesting article in “Teaching Children Mathematics” entitled, “Practical Problems: Using Literature to Teach Statistics”. The authors, Hourigan and Leavy, used a children’s book to generate the data on the number of times different characters appeared. But what I liked most was that they addressed the need for a “driving question”. In this case the question was provided by a pre-school teacher who could only afford to buy one puppet for the book, and wanted to know which character appears the most in the story. The children practised collecting data as the story was read aloud. They collected their own data to analyse.

Let’s have a look at the different pros and cons of student-collected data, provided real data, and high-quality fictional data.

Collecting data

When we want students to experience the process of collecting real data, they need to collect real data. However, real-time data collection is time consuming, and probably not necessary every year. Student data collection can be simulated by a program such as The Islands, which I wrote about previously. Data that students collect themselves is much more likely to have errors in it, or be “dirty” (which is a good thing). When students are only given clean datasets, such as those usually provided with textbooks, they do not learn the skills of deciding what to do with an errant data point. Fictional databases can also have dirty data generated into them. The fictional inhabitants of The Islands sometimes lie, and often refuse to give consent for data collection on them.


One of the species of dragons included in our database

I have heard that after a few years of school, graphs about cereal preference, number of siblings and type of pet get a little old. These topics, relating to the students, are motivating at first, but often there is no purpose to the investigation other than to get data for a graph. Students need to move beyond their own experience and are keen to try something new. Data provided in a database can be motivating, if carefully chosen. There are opportunities to use databases that encourage awareness of social justice, the environment and politics. Fictional data must be motivating or there is no point! We chose dragons as a topic for our first set of fictional data, as dragons are interesting to boys and girls of most ages.

A meaningful question

Here I refer again to that excellent article that talks about a driving question. There needs to be a reason for analysing the data. Maybe there is concern about food provided at the tuck shop, with healthy alternatives. Or can the question be tied into another area of the curriculum, such as which type of bean plant grows faster? Or can we increase the germination rate of seeds? The Census@school data has the potential for driving questions, but they probably need to be helped along. For existing datasets the driving question used by students might not be the same as the one (if any) driving the original collection of data. Sometimes that is because the original purpose is not ‘motivating’ for the students or not at an appropriate level. If you can’t find or make up a motivating meaningful question, the database is not appropriate.

For our fictional dragon data, we have developed two scenarios – vaccinating for Pacific Draconian flu, and building shelters to make up for the deforestation of the island. With the vaccination scenario, we need to know about behaviour and size. For the shelter scenario we need to make decisions based on size, strength, behaviour and breath type. There is potential for a number of other scenarios that will also create driving questions.

Getting enough data

It can be difficult to get enough data for effects to show up. When students are limited to their class or family, this limits the number of observations. Only some databases have enough observations in them. There is no such problem with fictional databases, as you can just generate as much data as you need! There are special issues with regard to teaching about sampling, where you would want a large database with constrained access, like the Islands data, or the use of cards.


A problem with the data students collect is that it tends to be categorical, which limits the types of analysis that can be used. In databases, it can also be difficult to find measurement level data. In our fictional dragon database, we have height, strength and age, which all take numerical values. There are also four categorical variables. The Islands database has a large number of variables, both categorical and numerical.

Interesting Effects

Though it is good for students to understand that quite often there is no interesting effect, we would like students to have the satisfaction of finding interesting effects in the data, especially at the start. Interesting effects can be particularly exciting if the data is real, and they can apply their findings to the real world context. Student-collected-data is risky in terms of finding any noticeable relationships. It can be disappointing to do a long and involved study and find no effects. Databases from known studies can provide good effects, but unfortunately the variables with no effect tend to be left out of the databases, giving a false sense that there will always be effects. When we generate our fictional data, we make sure that there are the relationships we would like there, with enough interaction and noise. This is a highly skilled process, honed by decades of making up data for student assessment at university. (Guilty admission)
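To give a flavour of what generating fictional data involves, here is a minimal Python sketch. The variables, effect sizes and noise level are all assumptions invented for this example. It is not the procedure or the effects used for the actual dragon data.

```python
import random

def make_dragon(rng):
    """Generate one fictional dragon with a built-in age-strength relationship."""
    age = rng.randint(1, 20)
    sex = rng.choice(["female", "male"])
    colour = rng.choice(["red", "green", "blue"])  # deliberately has no effect
    # Assumed effects: strength rises with age, females get a small bonus,
    # and Gaussian noise keeps the relationship from being too tidy.
    strength = 20 + 2.5 * age + (5 if sex == "female" else 0) + rng.gauss(0, 6)
    return {"age": age, "sex": sex, "colour": colour, "strength": round(strength)}

rng = random.Random(3)  # fixed seed so the same "population" is produced each run
dragons = [make_dragon(rng) for _ in range(240)]
print(dragons[:3])
```

Including a variable such as colour that has no effect is a deliberate choice in this sketch, so that students also meet the experience of looking for a relationship and not finding one.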


There are ethical issues to be addressed in the collection of real data from people the students know. Informed consent should be granted, and there needs to be thorough vetting. Young students (and not so young) can be damagingly direct in their questions. You may need to explain that it can be upsetting for people to be asked if they have been beaten or bullied. When using fictional data, that may appear real, such as the Islands data, it is important for students to be aware that the data is not real, even though it is based on real effects. This was one of the reasons we chose to build our first database on dragons, as we hope that will remove any concerns about whether the data is real or not!

The following table summarises the post.

| | Real data collected by the students | Real existing database | Fictional data (The Islands, Kiwi Kapers, Dragons, Desserts) |
|---|---|---|---|
| Data collection | Real experience | Nil | Sometimes |
| Dirty data | Always | Seldom | Can be controlled |
| Motivating | Can be | Can be | Must be! |
| Enough data | Time consuming, difficult | Hard to find | Always |
| Meaningful question | Sometimes. Can be trivial | Can be difficult | Part of the fictional scenario |
| Variables | Tend towards nominal | Often too few variables | Generate as needed |
| Ethical issues | Often | Usually fine | Need to manage reality |
| Effects | Unpredictable | Can be obvious or trivial, or difficult | Can be managed |

Engaging students in learning statistics using The Islands.

Three Problems and a Solution

Modern teaching methods for statistics have gone beyond the mathematical calculation of trivial problems. Computers can enable large size studies, bringing reality to the subject, but this is not without its own problems.

Problem 1: Giving students experience of the whole statistical process

There are many reasons for students to learn statistics through running their own projects, following the complete statistical enquiry process, posing a problem, planning the data collection, collecting and cleaning the data, analysing the data and drawing conclusions that relate back to the original problem. Individual projects can be both time consuming and risky, as the quality of the report, and the resultant grade can be dependent on the quality of the data collected, which may be beyond the control of the student.

The Statistical Enquiry Cycle, which underpins the NZ statistics curriculum.

Problem 2: Giving students experience of different types of sampling

If students are given an existing database and then asked to sample from it, this can be confusing for students and sends the misleading message that we would not want to use all the data available. But physically performing a sample, based on a sampling frame, can be prohibitively time consuming.

Problem 3: Giving students experience conducting human experiments

The problem here is obvious. It is not ethical to perform experiments on humans simply to learn about performing experiments.

An innovative solution: The Islands virtual world.

I recently ran an exciting workshop for teachers on using The Islands. My main difficulty was getting the participants to stop doing the assigned tasks long enough to discuss how we might implement this in their own classrooms. They were too busy clicking around different villages and people, finding subjects of the right age and getting them to run down a 15-degree slope – all without leaving the classroom.

The Island was developed by Dr Michael Bulmer from the University of Queensland and is a synthetic learning environment. The Islands, the second version, is a free, online, virtual human population created for simulating data collection.

The synthetic learning environment overcomes practical and ethical issues with applied human research, and is used for teaching students at many different levels. For a login, email james.baglin @ rmit.edu.au (without the spaces in the email address).

There are now approximately 34,000 inhabitants of the Islands, who are born, have families (or not) and die in a speeded up time frame where 1 Island year is equivalent to about 28 earth days. They each carry a genetic code that affects their health etc. The database is dynamic, so every student will get different results from it.

The Islanders

Some of the Islanders

Two magnificent features

To me, one of the two best features is the difficulty of acquiring data on individuals. It takes time for students to collect samples, as each subject must be asked individually, and the results recorded in a database. There is no easy access to the population. This is still much quicker than asking people in real life (or “irl” as it is known on social media). It is obvious that you need to sample and to have a good sampling plan, and you need to work out how to record and deal with your data.

The other outstanding feature is the ability to run experiments. You can get a group of subjects and split them randomly into treatment and control groups. Then you can perform interventions, such as making them sit quietly or run about, or drink something, and then evaluate their performance on some other task. This is without requiring real-life ethical approval and informed consent. However, in a touch of reality the people of the Islands sometimes lie, and they don’t always give consent.
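The random split into treatment and control groups is the step students most often want to shortcut, so it is worth showing how little it takes. This Python sketch uses hypothetical Islander labels. The Islands itself does not expose this function, and the name `randomise` is my own.

```python
import random

def randomise(subjects, seed=None):
    """Shuffle the subjects and split them into two groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = subjects[:]  # copy so the original list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical Islander labels, for illustration only
islanders = [f"Islander {i}" for i in range(1, 21)]
treatment, control = randomise(islanders, seed=4)
print("Treatment:", treatment)
print("Control:  ", control)
```

Letting chance decide who gets the intervention is what allows a difference between the groups to be attributed to the treatment and not to how the subjects were chosen.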

There are over 200 tasks that you can assign to your people, covering a wide range of topics. They include blood tests, urine tests, physiology, food and drinks, injections, tablets, mental tasks, coordination, exercise, music, environment etc. The tasks occur in real (reduced) time, so you are not inclined to include more tasks than are necessary. There is also the opportunity to survey your Islanders, with more than fifty possible questions. These also take time to answer, which encourages judicious choice of questions.


In the workshop we used the Islands to learn about sampling distributions. First each teacher took a sample of one male and one female and timed them running down a hill. We made (fairly awful) dotplots on the whiteboard using sticky notes with the individual times on them. Then each teacher took a sample and found the median time. We used very small samples of 7 each as we were constrained by time, but larger samples would be preferable. We then looked at the distributions of the medians and compared that with the distribution of our first sample. The lesson was far from polished, but the message was clear, and it gave a really good feel for what a sampling distribution is.
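The workshop activity can be imitated in Python as a rough check on the idea. The running times below are simulated from an assumed normal distribution and are not data from The Islands. The choice of 30 "teachers" is also an assumption.

```python
import random
import statistics

rng = random.Random(5)  # fixed seed so the sketch is repeatable

# Hypothetical population of running times in seconds (not real Islands data)
population = [rng.gauss(12, 2.5) for _ in range(5000)]

# Distribution of single observations, like the first dotplot on the whiteboard
individuals = rng.sample(population, 30)

# Distribution of sample medians: each "teacher" samples 7 people
medians = [statistics.median(rng.sample(population, 7)) for _ in range(30)]

print(f"Spread of individual times: {statistics.stdev(individuals):.2f} s")
print(f"Spread of sample medians:   {statistics.stdev(medians):.2f} s")
```

The medians should cluster more tightly than the individual times, which is the feature of a sampling distribution that the two dotplots made visible.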

Within the New Zealand curriculum, we could also use The Islands to learn about bivariate relationships, sampling methods and randomised experiments.

In my workshop I had educators from across the age groups, and a primary teacher assured me that Year 4 students would be able to make use of this. Fortunately there is a maturity filter so that you can remove options relating to drugs and sexual activity.

James Baglin from RMIT University has successfully trialled the Island with high school students and psychology research methods students. The owners of the Island generously allow free access to it. Thanks to James Baglin, who helped me prepare this post.

Here are links to some interesting papers that have been written about the use of The Islands in teaching. We are excited about the potential of this teaching tool.

Huynh, Baglin, & Bedford (2014). Improving the attitudes of high school students towards statistics: An Island-based approach. ICOTS9.

Baglin, Reece, Bulmer, & Di Benedetto (2013). Simulating the data investigative cycle in less than two hours: Using a virtual human population, cloud collaboration and a statistical package to engage students in a quantitative research methods course.

Bulmer, M. (2010). Technologies for enhancing project assessment in large classes. In C. Reading (Ed.), Proceedings of the Eighth International Conference on Teaching Statistics, July 2010. Ljubljana, Slovenia. Retrieved from http://www.stat.auckland.ac.nz/~iase/publications/icots8/ICOTS8_5D3_BULMER.pdf

Bulmer, M., & Haladyn, J. K. (2011). Life on an Island: A simulated population to support student projects in statistics. Technology Innovations in Statistics Education, 5(1). Retrieved from http://escholarship.org/uc/item/2q0740hv

Baglin, J., Bedford, A., & Bulmer, M. (2013). Students’ experiences and perceptions of using a virtual environment for project-based assessment in an online introductory statistics course. Technology Innovations in Statistics Education, 7(2), 1–15. Retrieved from http://www.escholarship.org/uc/item/137120mt

Learning to teach statistics, in a MOOC

I am participating in a MOOC, Teaching statistics through data investigations. A MOOC is a fancy name for an online, free, correspondence course. The letters stand for Massive Open Online Course. I decided to enrol for several reasons. First I am always keen to learn new things. Second, I wanted to experience what it is like to be a student in a MOOC. And third I wanted to see what materials we could produce that might help teachers or learners of statistics in the US. We are doing well in the NZ market, but it isn’t really big enough to earn us enough money to do some of the really cool things we want to do in teaching statistics to the masses.

I am now up to Unit 4, and here is what I have learned so far:

Motivation and persistence

It is really difficult to stay motivated even in the best possible MOOC. Life gets in the way and there is always something more pressing than reading the materials, taking part in discussions and watching the videos. I looked up the rate of completion for MOOCs, and this article from IEEE gives the completion rate at 5%. Obviously it will differ between MOOCs, depending on the content, the style, the reward. I have found I am best to schedule time to apply to the MOOC each week, or it just doesn’t happen.

I know more than I thought I did

It is reassuring to find out that I really do have some expertise. (This may be a bit of a worry to those of you who regularly read my blog and think I am an expert in teaching statistics.) My efforts to read and ponder, to discuss and to experiment have meant that I do know more than teachers who are just beginning to teach statistics. Phew!

The investigative process matters

I finally get the importance of the Statistical Enquiry Cycle (PPDAC in New Zealand) or Statistical Investigation Cycle (Pose, Collect, Analyse, Interpret in the US). I sort of got it before, but now it is falling into place. In the old-fashioned approach to teaching statistics, almost all the emphasis was on the calculations. There would be questions asking students to find the mean of a set of numbers, with no context. This is not statistics, but an arithmetic exercise. Unless a question is embedded in the statistical process, it is not statistics. There needs to be a reason, a question to answer, real data and a conclusion to draw. Every time we develop a teaching exercise for students, we need to think about where it sits in the process, and provide the context.

Brilliant questions

I was happy to participate in the LOCUS quiz to evaluate my own statistical understanding. I was relieved to get 100%. But I was SO impressed with the questions, which reflected the work and thinking that produced them. I understand how difficult it is to write questions to teach and assess statistical understanding, as I have written hundreds of them myself. The LOCUS questions are great questions. I will be writing some of my own following their style. I loved the ones that asked what would be the best way to improve an experimental design. Inspired!

It’s easier to teach the number stuff

I’m sure I knew this, but seeing so many teachers say it cemented it. Teacher after teacher commented that teaching procedure is so much easier than teaching concepts. Testing knowledge of procedure is so much easier than assessing conceptual understanding. Maths teachers are really good at procedure. That fluffy, hand-waving meaning stuff is just…difficult. And it all depends. Every answer depends! The implication of this is that we need to help teachers become more confident in helping students to learn the concepts of statistics. We need to develop materials that focus on the concepts. I’m pretty happy that most of my videos do just that – my “Understanding Confidence Intervals” is possibly the only video on confidence intervals that does not include a calculation or procedure.

You learn from other participants

I’ve never been keen on group work. I suspect this is true of most over-achievers. We don’t like to work with other people on assignments as they might freeload, or worse – drag our grade down. Over the years I’ve forced students to do group assignments, as they learn so much more in the process. And I hate to admit that I have also learned more when forced to do group assignments. It isn’t just about reducing the marking load. In this MOOC we are encouraged to engage with other participants through the discussion forums. This is an important part of on-line learning, particularly in a solely on-line platform (as opposed to blended learning). I just love reading what other people say. I get ideas, and I understand better where other people are coming from.

I have something to offer

It was pretty exciting to see my own video used as a resource in the course, and to hear from the instructor how she loves our Statistics Learning Centre videos.

What now?

I still have a few weeks to run on the MOOC and I will report back on what else I learn. And then in late May I am going to USCOTS (US Conference on Teaching Statistics). It’s going to cost me a bit to get there, living as I do in the middle of nowhere in Middle Earth. But I am thrilled to be able to meet with the movers and shakers in US teaching of statistics. I’ll keep you posted!

Teaching random variables and distributions

Why do we teach about random variables, and why is it so difficult to understand?

Probability and statistics go together pretty well and basic probability is included in most introductory statistics courses. Often maths teachers prefer the probability section as it is more mathematical than inference or exploratory data analysis. Both probability and statistics deal with the idea of uncertainty and chance, statistics mostly being about what has happened, and probability about what might happen. Probability can be, and often is, reduced to fun little algebraic puzzles, with little link to reality. But a sound understanding of the concepts of probability and distribution is essential to H.G. Wells’s “efficient citizen”.

When I first started on our series of probability videos, I wrote about the worth of probability. Now we are going a step further into the probability topic abyss, with random variables. For an introductory statistics course, it is an interesting question of whether to include random variables. Is it necessary for the future marketing managers of the world, the medical practitioners, the speech therapists, the primary school teachers, the lawyers to understand what a random variable is? Actually, I think it is. Maybe it is not as important as understanding concepts like risk and sampling error, but random variables are still important.

Random variables

Like many concepts in our area, once you get what a random variable is, it can be hard to explain. Now that I understand what a random variable is, it is difficult to remember what was difficult to understand about it. But I do remember feeling perplexed, trying to work out what exactly a random variable was. The lecturers use the term freely, but I remember (many decades ago) just not being able to pin down what a random variable is. And why it needed to exist.

To start with, the words “random variable” are difficult on their own. I have dedicated an entire post to the problems with “random”, and in the writing of it, discovered another inconsistency in the way that we use the word. When we are talking about a random sample, random implies equal likelihood. Yet when we talk about things happening randomly, they are not always equally likely. The word “variable” is also a problem. Surely all variables vary? Students may wonder what a non-random variable is – I know I did.

I like to introduce the idea of variables, as part of mathematical modelling. We can have a simple model:

Cost of event = hall hire + per capita charge x number of guests.

In this model, the hall hire and per capita charge are both constants, and the number of guests is a variable. The cost of the event is also a variable, and can be expressed as a function of the number of guests. And vice versa! Now if we know the number of guests, we can then calculate the cost of the event. But the number of guests may be uncertain – it could be something between 100 and 120. It is thus a random variable.
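For readers who like to see this in code, here is a minimal sketch of the event model. The dollar amounts are hypothetical, and I have assumed that every guest count from 100 to 120 is equally likely, which is only one of many ways the uncertainty could be modelled.

```python
import random

# Hypothetical constants, chosen only for illustration.
HALL_HIRE = 500.0
PER_CAPITA_CHARGE = 30.0


def event_cost(guests):
    """Cost of event = hall hire + per capita charge x number of guests."""
    return HALL_HIRE + PER_CAPITA_CHARGE * guests


def simulate_costs(n_events, seed=1):
    """Simulate event costs when the number of guests is uncertain.

    Assumption: each whole number of guests from 100 to 120 is equally likely.
    """
    rng = random.Random(seed)
    return [event_cost(rng.randint(100, 120)) for _ in range(n_events)]


costs = simulate_costs(10_000)
print("lowest cost:", min(costs))
print("highest cost:", max(costs))
print("average cost:", sum(costs) / len(costs))
```

Because the cost is a function of a random variable, the cost is itself a random variable: each simulated event gives a different value, but the values follow a pattern.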

Another way to look at a random variable is to come from the other direction – start with the random part and add the variable part. When something random happens, sometimes the outcome is discrete and non-numerical, such as the sex of a baby, the colour of a tulip, or the type of fruit in a lunchbox. But when the random outcome is given a value, then it becomes a random variable.


Pictorial representation of different distributions


Then we come to distributions. I fear that too often distributions are taught in such a way that students believe that the normal or bell curve is a property guiding the universe, rather than a useful model that works in many different circumstances. (Rather like Adam Smith’s invisible hand that economists worship.) I’m pretty sure that is what I believed for many years, in my fog of disconnected statistical concepts. Somewhat telling is the tendency for examples to begin with the words, “The life expectancy of a particular brand of lightbulb is normally distributed with a mean of …” or similar. Worse still, some examples don’t even mention the normal distribution, and simply say “The mean income per household in a certain state is $9500 with a standard deviation of $1750. The middle 95% of incomes are between what two values?” Students are left to assume that the normal distribution will apply, which in the second case is only a very poor approximation as incomes are likely to be skewed. This sloppy question-writing perpetuates the idea of the normal distribution as the rule that guides the universe.

Take a look at the textbook you use, and see what language it uses when asking questions about the normal distribution. The two examples above are from a popular AP statistics test preparation text.
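To see what the income question is silently assuming, here is a rough sketch that works out the "middle 95%" twice: once under the normal model the question expects, and once under a right-skewed model with the same mean and standard deviation. The lognormal is my own choice of skewed model for illustration; the question itself gives no basis for either distribution.

```python
from math import exp, log, sqrt
from statistics import NormalDist

MEAN = 9500.0  # mean household income from the quoted question
SD = 1750.0    # standard deviation from the quoted question


def normal_middle_95(mean, sd):
    """Middle 95% of values if incomes were normally distributed."""
    dist = NormalDist(mean, sd)
    return dist.inv_cdf(0.025), dist.inv_cdf(0.975)


def lognormal_middle_95(mean, sd):
    """Middle 95% under a lognormal model with the same mean and sd.

    Assumption: the lognormal is only an illustrative skewed alternative.
    """
    sigma_sq = log(1 + (sd / mean) ** 2)  # variance on the log scale
    mu = log(mean) - sigma_sq / 2         # mean on the log scale
    z = NormalDist().inv_cdf(0.975)
    sigma = sqrt(sigma_sq)
    return exp(mu - z * sigma), exp(mu + z * sigma)


print("normal model:   ", normal_middle_95(MEAN, SD))
print("lognormal model:", lognormal_middle_95(MEAN, SD))
```

The normal interval is symmetric about the mean, while the skewed model gives a different pair of values from exactly the same mean and standard deviation. The answer depends on the model, which is why the question should name one.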

I thought I’d better take a look at what Khan Academy did to random variables. I started watching the first video and immediately got hit with the flipping coin and rolling dice. No, people – this is not the way to introduce random variables! No one cares how many coins are heads. And even worse he starts with a zero/one random variable because we are only flipping one coin. And THEN he says that he could define a head as 100 and tail as 703 and…. Sorry, I can’t take it anymore.

A good way to introduce random variables

After LOTS of thinking and explaining, and trying stuff out, I have come up with what I think is a revolutionary and fabulous way to introduce random variables and distributions. To begin with we use a discrete empirical distribution to illustrate the idea of a random variable. The random variable models the number of ice creams per customer. Then we use that discrete distribution to teach about expected value and standard deviation, and combining random variables. The third video introduces the idea of families of distributions, and shows how different distributions can be used to model the same random process.
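As a small illustration of the calculations involved, here is a sketch of a discrete distribution for the number of ice creams per customer. The probabilities are hypothetical, made up for this example rather than taken from the videos, and the combining step assumes the two customers buy independently.

```python
from math import sqrt

# Hypothetical distribution: number of ice creams -> probability.
ICE_CREAMS = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}


def expected_value(dist):
    """Probability-weighted average of the outcomes."""
    return sum(x * p for x, p in dist.items())


def standard_deviation(dist):
    """Square root of the probability-weighted squared deviations."""
    mu = expected_value(dist)
    return sqrt(sum(p * (x - mu) ** 2 for x, p in dist.items()))


def sum_of_independent(dist_a, dist_b):
    """Distribution of the total of two independent random variables."""
    total = {}
    for xa, pa in dist_a.items():
        for xb, pb in dist_b.items():
            total[xa + xb] = total.get(xa + xb, 0.0) + pa * pb
    return total


two_customers = sum_of_independent(ICE_CREAMS, ICE_CREAMS)
print("one customer: ", expected_value(ICE_CREAMS), standard_deviation(ICE_CREAMS))
print("two customers:", expected_value(two_customers), standard_deviation(two_customers))
```

With these made-up probabilities, the expected values add when the two customers are combined, but the standard deviations do not. That is the kind of result the second video builds towards.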

Another unusual feature is the introduction of the triangular distribution, which is part of the New Zealand curriculum. You can read here about the benefits of teaching the triangular distribution.

I’m pretty excited about this approach to teaching random variables and distributions. I’d love some feedback about it!