Why people hate statistics

This summer/Christmas break it has been my pleasure to help a young woman who is struggling with statistics, and it has prompted me to ask people who teach postgraduate statistical methods – WTF are you doing?

Louise (name changed) is a bright, hard-working young woman who has finished an undergraduate degree at a prestigious university and is now doing a Master’s degree at a different prestigious university, which is a long way from where I live and will remain nameless. I have been working through her lecture slides, past and future, and attempting to develop in her some confidence that she will survive the remainder of the course, and that statistics is in fact fathomable.

Incomprehensible courses alienating research students

After each session with Louise I have come away shaking my head and wondering what this lecturer is up to. I wonder if he/she really understands statistics or is just passing on their own confusion. And the very sad thing is that I KNOW that there are hundreds of lecturers in hundreds of similar courses around the world teaching in much the same way and alienating thousands of students every year.

And they need to stop.

Here is the approach: You have approximately eight weeks, made up of four-hour sessions, in which to teach your Master’s students everything they could possibly need to know about statistics. So you tell them everything! You use technical terms with little explanation, and you give no indication of what is important and what is background. You dive right in with no clear purpose, and you expect them to keep up.

Choosing your level

Frequently Louise would ask me to explain something and I would pause to think. I was trying to work out how deep to go. It is like when a child asks where babies come from. They may want the full details, but they may not, and you need to decide what level of answer is most appropriate. Anyone who has seen our popular YouTube videos will be aware that I encourage conceptual understanding at best, and the equivalent of a statistics driver’s licence at worst. When you have eight weeks to learn everything there is to know about statistics, up to and including multiple regression, logistic regression, GLM, factor analysis, non-parametric methods and more, I believe the most you can hope for is to be able to get the computer to run the test, and then draw intelligent conclusions from the output.

There was nothing in the course about data collection, data cleaning, the concept of inference or the relationship between the model and reality. My experience is that data cleaning is one of the most challenging parts of analysis, especially for novice researchers.

Use learning objectives

And maybe one of the worst problems with Louise’s course was that there were no specific learning objectives. One of my most popular posts is on the need for learning objectives. Now I am not proposing that we slavishly tell students in each class what it is they are to learn, as that can be tedious and remove the fun from more discovery style learning. What I am saying is that it is only fair to tell the students what they are supposed to be learning. This helps them to know what in the lecture is important, and what is background. They need to know whether they need to have a passing understanding of a test, or if they need to be able to run one, or if they need to know the underlying mathematics.

Take, for example, the t-test. There are many ways that the t-statistic can be used, so simply referring to a test as a t-test is misleading before you even start. And starting your teaching with the statistic is not helpful. We need to start with the need! I would call it a test for the difference of two means from two groups. And I would just talk about the t statistic in passing. I would give examples of output from various scenarios, some of which reject the null, some of which don’t and maybe even one that has a p-value of 0.049 so we can talk about that. In each case we would look at how the context affects the implications of the test result. In my learning objectives I would say: Students will be able to interpret the output of a test for the difference of two means, putting the result in context. And possibly, Students will be able to identify ways in which a test for the difference of two means violates the assumptions of a t-test. Now that wasn’t hard, was it?
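
For readers who like to see the machinery, here is a minimal sketch of what sits behind the output of a test for the difference of two means. The scores are invented for illustration, and the function name welch_t is my own. It uses Welch's version of the t statistic, which does not assume equal variances; a statistics package would go on to report a p-value as well, which is left out here to keep the sketch within the Python standard library.

```python
import math
import statistics

# Invented example data: scores for two hypothetical groups of students.
group_a = [72, 85, 78, 90, 66, 81, 77, 88]
group_b = [65, 70, 74, 68, 79, 62, 71, 69]


def welch_t(x, y):
    """Return (difference in means, t statistic, approximate degrees of freedom)."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    # Squared standard error of each group mean.
    se2_x = statistics.variance(x) / len(x)
    se2_y = statistics.variance(y) / len(y)
    diff = mean_x - mean_y
    t = diff / math.sqrt(se2_x + se2_y)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_x + se2_y) ** 2 / (
        se2_x ** 2 / (len(x) - 1) + se2_y ** 2 / (len(y) - 1)
    )
    return diff, t, df


diff, t, df = welch_t(group_a, group_b)
print(f"difference in means = {diff:.2f}, t = {t:.2f}, df = {df:.1f}")
```

The number to talk about with students is the difference in means, in the units of the problem. The t statistic only tells us how large that difference is relative to the noise in the data.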

Like driving a car

Louise likes to understand where things come from, so we did go through an overview of how various distributions have been found to model different aspects of the world well – starting with the normal distribution, and with a quick jaunt into the Central Limit Theorem. I used my Dragonistics data cards, which were invented for teaching primary school, but actually work surprisingly well at all levels! I can’t claim that Louise understands the use of the t distribution, but I hope she now believes in it. I gave her the analogy of learning to drive – that we don’t need to know what is happening under the bonnet to be a safe driver. In fact safe driving depends more on paying attention to the road conditions and human behaviour.
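
A quick simulation can help with the "believing" part of the Central Limit Theorem. The sketch below is my own illustration, not something from Louise's course: it draws many samples from a strongly skewed distribution (an exponential with mean 1) and looks at how the sample means behave. The sample size of 40 and the 5000 repetitions are arbitrary choices.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

SAMPLE_SIZE = 40    # observations per sample (arbitrary choice)
NUM_SAMPLES = 5000  # number of samples drawn (arbitrary choice)

# Exponential distribution with rate 1: right-skewed, mean 1, sd 1.
sample_means = [
    statistics.mean([random.expovariate(1.0) for _ in range(SAMPLE_SIZE)])
    for _ in range(NUM_SAMPLES)
]

centre = statistics.mean(sample_means)
spread = statistics.stdev(sample_means)
predicted_spread = 1 / math.sqrt(SAMPLE_SIZE)  # sigma / sqrt(n)

# Share of sample means within two standard deviations of the centre;
# for a normal distribution this would be roughly 95%.
within = sum(abs(m - centre) <= 2 * spread for m in sample_means) / NUM_SAMPLES

print(f"centre of sample means: {centre:.3f}")
print(f"spread of sample means: {spread:.3f} (theory predicts {predicted_spread:.3f})")
print(f"share within 2 sd: {within:.1%}")
```

Even though the individual observations are nothing like normal, the sample means cluster around the population mean in a roughly bell-shaped way.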

Assumptions

Louise tells me that her lecturer emphasises assumptions – that the students need to examine them all, every time they look at or perform a statistical test. Now I have no problems with this later on, but students need to have some idea of where they are going and why, before being told what luggage they can and can’t take. And my experience is that assumptions are always violated. Always. As George Box put it – “All models are wrong, but some are useful.”

It did not help that the lecturer seemed a little confused about the assumption of normality. I am not one to point the finger, as this is a tricky assumption, as the Andy Field textbook pointed out. For example, we do not require the independent variables in a multiple regression to be normally distributed as the lecturer specified. This is not even possible if we are including dummy variables. What we do need to watch out for is that the residuals are approximately modelled by a normal distribution, and if not, that we do something about it.
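
To make the point about residuals concrete, here is a small sketch with simulated data (all numbers invented). The predictor is a 0/1 dummy variable, which cannot possibly be normally distributed, yet the residuals from the fitted line behave like a sample from a normal distribution. The least-squares fit is written out by hand, and the skewness helper is a simple moment-based measure of my own choosing, not a formal test of normality.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is repeatable

# Invented data: a 0/1 dummy predictor (say, control vs treatment) and an
# outcome with normally distributed noise around each group mean.
x = [0] * 100 + [1] * 100
y = [50 + 10 * xi + random.gauss(0, 5) for xi in x]

# Ordinary least squares for a single predictor, written out by hand.
mean_x, mean_y = statistics.mean(x), statistics.mean(y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def skewness(values):
    """Moment-based skewness; close to 0 for a symmetric distribution."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


print("distinct predictor values:", sorted(set(x)))
print(f"slope = {slope:.2f}, residual skewness = {skewness(residuals):.2f}")
```

In practice most people would look at a histogram or normal probability plot of the residuals rather than a single number, but the idea is the same: check the residuals, not the predictors.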

You may have gathered that my approach to statistics is practical rather than idealistic. Why get all hot and bothered about whether you should do a parametric or non-parametric test, when the computer package does both with ease, and you just need to check if there is any difference in the result? (I can hear some purists hyperventilating at this point!) My experience is that the results seldom differ.
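
Here is a rough sketch of the "run both and compare" idea, using simulated data and only the Python standard library. Everything in it is an assumption for illustration: the data are invented, both p-values use a normal approximation (reasonable for groups of about 40, but not what a statistics package would report for small samples), and the Mann-Whitney version assumes there are no tied values.

```python
import math
import random
import statistics

random.seed(3)  # fixed seed so the sketch is repeatable
normal = statistics.NormalDist()

# Invented data: two groups of 40, the second shifted upward.
group_a = [random.gauss(50, 10) for _ in range(40)]
group_b = [random.gauss(57, 10) for _ in range(40)]


def two_sided_p(z):
    """Two-sided p-value from a standard normal approximation."""
    return 2 * (1 - normal.cdf(abs(z)))


def welch_p(x, y):
    """Welch t statistic, with the p-value approximated by a normal curve."""
    se = math.sqrt(
        statistics.variance(x) / len(x) + statistics.variance(y) / len(y)
    )
    return two_sided_p((statistics.mean(x) - statistics.mean(y)) / se)


def mann_whitney_p(x, y):
    """Mann-Whitney U from rank sums, normal approximation, assuming no ties."""
    ranked = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = sum(
        rank for rank, (_, group) in enumerate(ranked, start=1) if group == 0
    )
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return two_sided_p((u - mean_u) / sd_u)


p_t = welch_p(group_a, group_b)
p_u = mann_whitney_p(group_a, group_b)
print(f"parametric p = {p_t:.4f}, non-parametric p = {p_u:.4f}")
```

If the two p-values lead to the same conclusion, there is little to agonise over. If they disagree, that is the signal to look more closely at the data.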

What post-graduate statistical methods courses should focus on

Instructors need to concentrate on the big ideas of statistics – what is inference, why we need data, how a sample is collected matters, and the relationship between a model and the reality it is modelling. I would include the concept of correlation, and its problematic link to causation. I would talk about the difference between statistical significance and usefulness, and evidence and strength of a relationship. And I would teach students how to find the right fishing lessons! If a student is critiquing a paper that uses logistic regression, that is the time they need to read up enough about logistic regression to be able to understand what they are reading. They cannot possibly learn a useful amount about all the tests or methods that they may encounter one day.

If research students are going to be doing their own research, they need more than a one semester fly-by of techniques, and would be best to get advice from a statistician BEFORE they collect the data.

Final word

So here is my take-home message:

Stop making graduate statistical methods courses so outrageously difficult by cramming them full of advanced techniques and concepts. Instead help students to understand what statistics is about, and how powerful and wonderful it can be to find out more about the world through data.

Your word

Am I right or is my preaching of the devil? Please add your comments below.

Talking in class: improving discussion in maths and stats classes

Maths is right or wrong – end of discussion – or is it?

In 1984 I was a tutor in Operations Research to second year university students. My own experience of being in tutorials at University had been less than inspiring, with tutors who seemed reserved and keen to give us the answers without too much talking. I wanted to do a good job. My induction included a training session for teaching assistants from throughout the university. Margaret was a very experienced educational developer and was very keen for us to get the students discussing. I tried to explain to her that there really wasn’t a lot to discuss in my subject. You either knew how to solve a set of linear equations using Gauss-Jordan elimination or you didn’t. The answer was either correct or incorrect.

I suspect many people have this view of mathematics and its close relations, statistics and operations research. Our classes have traditionally followed a set pattern. The teacher shows the class how to do something. The class copies down notes and some examples into their books, and then they individually work through exercises in the textbook – generally in silence. The teacher walks around the room and helps students as needed.

Prizes can help motivate students to give answers in unfamiliar settings

So when we talk about discussion in maths classes, this is not something that mathematics and statistics teachers are all familiar with. I recently gave a workshop for about 100 Scholarship students in Statistics in the Waikato. What a wonderful time we had together! The students were from all different schools and needed to be warmed up a little with prizes, but we had some good discussion in groups and as a whole. One of the teachers commented later on the level of discussion in the session. Though she was an experienced maths teacher she found it difficult to lead discussion in the class. I am sure there are many like her.

It is important to talk in maths and stats classes

It is difficult for many students to learn in solitary silence. As we talk about a topic we develop our understanding, practice the language of the discipline and experience what it means to be a mathematician or statistician. Explaining ideas to others helps us to make sense of them ourselves. As we listen to other people’s thinking we can see how it relates to what we think, and can clear up misconceptions. Some people just like to talk, (who me?) and find learning more fun in a cooperative or collaborative environment. This recognition of the need for language and interaction underpins the development of “rich tasks” that are being used in mathematics classrooms throughout the world.

I have previously stated that “Maths learning should be communal and loud and exciting, not solitary, quiet and routine.”

Classroom atmosphere

One thing that was difficult at the Scholarship day was that the students did not know each other, and came from various schools. In a regular classroom the teacher has the opportunity of and responsibility for setting the tone of the class. Students need to feel safe. They need to feel that giving a wrong answer is not going to lead to ridicule. Several sessions at the start of the year may be needed to encourage discussion. Ideally this will become less necessary over time as students become used to interactive, inquiry-based learning in mathematics and statistics through their whole school careers.

“Number talks” are a tool to help students improve their understanding of number, and recognise that there are many ways to see things. For example, the class might be shown a picture of dots and asked to explain how many dots they see, and how they worked it out. Several different ways of thinking will be discussed.

Children are encouraged to think up multiple ways of thinking about numbers and to develop discussion by following prompts, sometimes called “talk moves”. Talk moves include revoicing, where the teacher restates what she thinks the student has said, asking students to restate another student’s reasoning, asking students to apply their own reasoning to someone else’s reasoning (Do you agree or disagree and why?), prompting for further participation (Would someone like to add on?), and using wait time (teachers should allow students to think for at least 10 seconds before calling on someone to answer). These are explained more fully in The Tools of Classroom talk.

Google Images is awash with classroom posters outlining “Talk moves”. I have been unable to trace back the source of the term or the list, and would be very pleased if someone could tell me the source, so that I can attribute this structure.

Good questions

The essence of good discussion is good questions. Question ping pong is not classroom discussion. We have all experienced a teacher working through examples on the board, while asking students the answers to numerical questions. This is a control technique for keeping students attentive, but it can fall to a small group of students who are quick to answer. I remember doing just this in my tutorial on solving matrices, when I didn’t know any better.

Teachers should avoid asking questions that they already know the answers to.

It is not a hard-and-fast rule, but definitely a thing to think about. I like to use True/False quizzes to help uncover misconceptions, and develop use of statistical language. I just about always know the answer to the question, but what I don’t know is how many students know the answer. So I ask the question not to know the answer, but to know if the students do, and to provoke discussion. Perhaps a more interesting question would be: how many students do you think will say “True” to this statement? It would then be interesting to find out their reasoning, so long as it does not get personal!

Multiple answers and open-ended questions

Where possible we need to ask questions that can have a number of acceptable answers. A discussion about what to do with outliers will seldom have a definitive answer, unless the answer is that it depends! Asking students to make a pictorial representation of an algebra problem can lead to interesting discussions.

The MathTwitterBlogosphere has many attractive ideas to use in teaching maths.

I rather like “Which one doesn’t belong”, which has echoes of “One of these things is not like the other, one of these things doesn’t belong, can you guess…” from Sesame Street. However, in Sesame Street the answer was usually unambiguous, whereas with WODB there are lots of ways to have alternative answers. There is a website dedicated to sets of four objects, and the discussion is about which one does not belong. In each case all four can “not belong” for some reason, which I find a bit contrived, but it can lead to discussion about which is the strongest case of not belonging.

Whole class and group discussion

Some discussions work well for a whole class, while others are better in small groups or pairs. Matching or ordering paper slips with expressions can lead to great discussion. For example we could have a set of graphs of the same data, and order them according to how effective they are at communicating the aspects of the data. Or there could be statements of possible events and students can place them in order of likelihood. The discussion involved in ordering them helps students to clarify the nature of probability. Desmos has a facility for teachers to set up card matching or grouping exercises, which reduces the work and waste of paper.

Our own Dragonistics data cards are great for discussion. Students can be given a number of dragons (more than two) and decide which one is the best, or which one doesn’t belong, or how to divide the dragons fairly into two or more groups.

It can seem to be wasting time to have discussion. However the evidence from research is that good discussion is an effective way for students to learn mathematics and statistics. I challenge all maths and stats teachers to increase and improve the discussion in their class.

The nature of mathematics and statistics and what it means to learn and teach them

I’ve been thinking lately….

Sometimes it pays to stop and think. I have been reading a recent textbook for mathematics teachers, Dianne Siemon et al, Teaching mathematics: foundations to middle years (2011). On page 47 the authors asked me to “Take a few minutes to write down your own views about the nature of mathematics, mathematics learning and mathematics teaching.” And bearing in mind I see statistics as related to, but not enclosed by mathematics, I decided to do the same for statistics as well. So here are my thoughts:

The nature of mathematics

Mathematicians love the elegance of mathematics

Mathematics is a way of modelling and making sense of the world. Mathematics underpins scientific and commercial endeavours as well as everyday life. Mathematics is about patterns and proofs and problem structuring and solution finding. I used to think it was all about the answer, but now I think it is more about the process. I used to think that maths was predominantly an individual endeavour, but now I can see how there is a social or community aspect as well. I fear that too often students are getting a parsimonious view of mathematics, thinking it is only about numbers, and something they have to do on their own. I find my understanding of the nature of mathematics is rapidly changing as I participate in mathematics education at different ages and stages. I have also been influenced by the work of Jo Boaler.

To learn mathematics

My original idea of mathematics learning comes from my own successful experience of copying down notes from the board, listening to the teacher and doing the exercises in the textbook. I was not particularly fluent with my times-tables, but loved problem-solving. If I got something wrong, I was happy to try again until I nutted it out. Sometimes I even did recreational maths, like the time I enumerated all possible dice combinations in Risk to find out who had the advantage – attacker or defender. I always knew that it took practice to be good at mathematics. However I never really thought of mathematics as a social endeavour. I feel I missed out, now. From time to time I do have mathematical discussions with my colleague. It was an adventure inventing Rogo and then working out a solution method. Mathematics can be a social activity.
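
For the curious, an enumeration like the Risk one takes only a few lines of code these days. This is a sketch rather than a record of what I did back then. It assumes the standard rules as I understand them: the attacker rolls up to three dice and the defender up to two, the highest dice are compared pairwise, and ties go to the defender. The function name battle_outcomes is made up for this example.

```python
from itertools import product


def battle_outcomes(attack_dice=3, defend_dice=2):
    """Count outcomes over every possible roll of the dice.

    Returns a dict mapping the number of armies the attacker loses
    to the number of rolls that produce that result.
    """
    counts = {}
    faces = range(1, 7)
    for roll in product(faces, repeat=attack_dice + defend_dice):
        attack = sorted(roll[:attack_dice], reverse=True)
        defend = sorted(roll[attack_dice:], reverse=True)
        # Compare highest with highest, then next highest with next highest.
        # The attacker loses an army whenever their die does not beat the defender's.
        attacker_losses = sum(a <= d for a, d in zip(attack, defend))
        counts[attacker_losses] = counts.get(attacker_losses, 0) + 1
    return counts


counts = battle_outcomes()
total = sum(counts.values())
for losses in sorted(counts):
    print(f"attacker loses {losses}: {counts[losses]} of {total} rolls")
```

Calling battle_outcomes with other numbers of dice, such as battle_outcomes(1, 1), shows how the balance shifts as the armies shrink.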

To teach mathematics

When I became a maths teacher I perpetuated the method that had worked for me, as I had not been challenged to think differently. I did like the ideas of mastery learning and personalised system of instruction. This meant that learners progressed to the next step only when they had mastered the previous one. I was a successful enough teacher and enjoyed my work.

Then as a university lecturer I had to work differently, and experimented. I had a popular personalised system of instruction quantitative methods course, relying totally on students working individually, at their own pace. I am happy that many of my students were successful in an area they had previously thought out of their reach. For some students it was the only subject they passed.

What I would do now

If I were to teach mathematics at school level again, I hope I would do things differently. I love the idea of “Number talks” and rich tasks which get students to think about different ways of doing things. I had often felt sad that there did not seem to be much opportunity to have discussions in maths, as things were either right or wrong. Now I see what fun we could have with open-ended tasks. Maths learning should be communal and loud and exciting, not solitary, quiet and routine. I have been largely constructivist in my teaching philosophy, but now I would like to try out social constructivist thinking.

Statistics

And what about statistics? At school in the 1970s I never learned more than the summary statistics and basic probability. At uni level it was bewildering, but I managed to get an A grade in a first year paper without understanding any of the basic principles. It wasn’t until I was doing my honours year in Operations Research and was working as a tutor in Statistical methods that things started to come together – but even then I was not at home with statistical ideas and was happy to leave them behind when I graduated.

The nature of statistics

Statistics lives in the real world

My views now on the nature of statistics are quite different. I believe statistical thinking is related to mathematical thinking, but with less certainty and more mess. Statistics is about models of reality, based on imperfect and incomplete data. Much of statistics is a “best guess” backed up by probability theory. And statistics is SO important to empowered citizenship. There are wonderful opportunities for discussion in statistics classes. I had a fun experience recently with a bunch of Year 13 Scholarship students in the Waikato. We had collected data from the students, having asked them to interpret a bar chart and a pie chart. There were some outliers in the data and I got them to suggest what we should do about them. There were several good suggestions and I let them discuss for a while then moved on. One asked me what the answer was and I said I really couldn’t say – any one of their suggestions was valid. It was a good teaching and learning moment. Statistics is full of multiple good answers, and often no single, clearly correct, answer.
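
A tiny sketch shows why "it depends" is an honest answer about outliers. The numbers below are invented (they are not the Scholarship day data), and the three summaries are just three defensible choices out of many. The helper name trimmed is my own.

```python
import statistics

# Invented data: ten measurements, two of which look suspiciously large.
times = [12, 15, 11, 14, 13, 16, 12, 14, 95, 120]


def trimmed(values, k=2):
    """Drop the k smallest and k largest values."""
    ordered = sorted(values)
    return ordered[k:len(ordered) - k]


summaries = {
    "mean, all data": statistics.mean(times),
    "median, all data": statistics.median(times),
    "mean, trimmed": statistics.mean(trimmed(times)),
}

for label, value in summaries.items():
    print(f"{label}: {value}")
```

Keeping everything and reporting the mean, keeping everything and reporting the median, or trimming the extremes are all reasonable. Which is best depends on why the unusual values are there and what question is being asked.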

Learning statistics

My popular Quantitative Methods for Business course was developed on the premise that learning statistics requires repeated exposure to similar analyses of multiple contexts. In the final module, students did many, many hypothesis tests, in the hope that it would gradually fall into place. That is what worked for me, and it did seem to work for many of the students. I think that is not a particularly bad way to learn statistics. But there are possibly better ways.

I do like experiential learning, and statistics is perfect for real life experiences. Perhaps the ideal way to learn statistics is by performing an investigation from start to finish, guided by a knowledgeable tutor. I say perhaps, because I have reservations about whether that is effective use of time. I wrote a blog post previously, suggesting that students need exposure to multiple examples in order to know what in the study is universal and what applies only to that particular context. So perhaps that is why students at school should be doing an investigation each year within a different context.

The nature of understanding

This does raise the question of what it means to learn or to understand anything. I hesitate to claim full understanding. Of anything. Understanding is progressive and multi-faceted and functional. As we use a technique we understand it more, such as hypothesis testing or linear programming. Understanding is progressive. My favourite quote about understanding is from Moore and Cobb, that “Mathematical understanding is not the only understanding.” I do not understand the normal distribution because I can read the Gaussian formula. I understand it from using it, and in a different way from a person who can derive it. In this way my understanding is functional. I have no need to be able to derive the Gaussian function for what I do, and the nature and level of my understanding of the normal distribution, or multiple regression, or bootstrapping is sufficient for me, for now.

Teaching statistics

I believe our StatsLC videos do help students to understand and learn statistics. I have put a lot of work into those explanations, and have received overwhelmingly positive feedback about the videos. However, that is no guarantee, as Khan Academy videos get almost sycophantic praise and I know that there are plenty of examples of poor pedagogy and even error in them. I have recently been reading from “Make it Stick”, which summarises theory based on experimental research on how people learn for recall and retention. I was delighted to find that the method we had happened upon in our little online quizzes was promoted as an effective method of reinforcing learning.

Your thoughts

This has been an enlightening exercise, and I recommend it to anyone teaching in mathematics or statistics. Read the first few chapters of a contemporary text on how to teach mathematics. Dianne Siemon et al, Teaching mathematics: foundations to middle years (2011) did it for me. Then “take a few minutes to write down your own views about the nature of mathematics, mathematics learning and mathematics teaching.” To which I add my own suggestion to think about the nature of statistics or operations research. Who knows what you will find out. Maybe you could put a few of your ideas down in the comments.

 

Mathematics teaching Rockstar – Jo Boaler

Moving around the education sector

My life in education has included being a High School maths teacher, then teaching at university for 20 years. I then made resources and gave professional development workshops for secondary school teachers. It was exciting to see the new statistics curriculum being implemented into the New Zealand schools. And now we are making resources and participating in the primary school sector. It is wonderful to learn from each level of teaching. We would all benefit from more discussion across the levels.

Educational theory and idea-promoters

My father used to say (and the sexism has not escaped me) “Never run after a woman, a bus or an educational theory, as there will be another one along soon.” Education theories have lifespans, and some theories are more useful than others. I am not a fan of “learning styles” and fear they have served many students ill. However, there are some current ideas and idea-promoters in the teaching of mathematics that I find very attractive. I will begin with Jo Boaler, and intend to introduce you over the next few weeks to Dan Meyer, Carol Dweck and the people who wrote “Make it Stick.”

Jo Boaler

My first contact with Jo Boaler was reading “The Elephant in the Classroom.” In this Jo points out how society is complicit in the idea of a “maths brain”. Somehow it is socially acceptable to admit or be almost defensively proud of being “no good at maths”. A major problem with this is that her research suggests that later success in life is connected to attainment in mathematics. In order to address this, Jo explores a less procedural approach to teaching mathematics, including greater communication and collaboration.

Mathematical Mindsets

It is interesting to see the effect Jo Boaler’s recent book, “Mathematical Mindsets”, is having on colleagues in the teaching profession. The maths advisors based in Canterbury NZ are strong proponents of her idea of “rich tasks”. Here are some tweets about the book:

“I am loving Mathematical Mindsets by @joboaler – seriously – everyone needs to read this”

“Even if you don’t teach maths this book will change how you teach for ever.”

“Hands down the most important thing I have ever read in my life”

What I get from Jo Boaler’s work is that we need to rethink how we teach mathematics. The methods that worked for mathematics teachers are not the methods we need to be using for everyone. The defence “The old ways worked for me” is not defensible in terms of inclusion and equity. I will not even try to boil down her approach in this post, but rather suggest readers visit her website and read the book!

At Statistics Learning Centre we are committed to producing materials that fit with sound pedagogical methods. Our Dragonistics data cards are perfect for use in a number of rich tasks. We are constantly thinking of ways to embed mathematics and statistics tasks into the curriculum of other subjects.

Challenges of implementation

I am aware that many of you readers are not primary or secondary teachers. There are so many barriers to getting mathematics taught in a more exciting, integrated and effective way. Primary teachers are not mathematics specialists, and may well feel less confident in their maths ability. Secondary mathematics teachers may feel constrained by the curriculum and the constant assessment in the last three years of schooling in New Zealand. And tertiary teachers have little incentive to improve their teaching, as it takes time from the more valued work of research.

Though it would be exciting if Jo Boaler’s ideas and methods were espoused in their entirety at all levels of mathematics teaching, I am aware that this is unlikely – as in a probability of zero. However, I believe that teachers at all levels can all improve, even a little at a time. We at Statistics Learning Centre are committed to this vision. Through our blog, our resources, our games, our videos, our lessons and our professional development we aim to empower all teachers to teach statistics – better! We espouse the theories and teachings explained in Mathematical Mindsets, and hope that you also will learn about them, and endeavour to put them into place, whatever level you teach at.

Do tell us if Jo Boaler’s work has had an impact on what you do. How can the ideas apply at all levels of teaching? Do teachers need to have a growth mindset about their own ability to improve their teaching?

Here are some quotes to leave you with:

Mathematical Mindsets Quotes

“Many parents have asked me: What is the point of my child explaining their work if they can get the answer right? My answer is always the same: Explaining your work is what, in mathematics, we call reasoning, and reasoning is central to the discipline of mathematics.”
“Numerous research studies (Silver, 1994) have shown that when students are given opportunities to pose mathematics problems, to consider a situation and think of a mathematics question to ask of it—which is the essence of real mathematics—they become more deeply engaged and perform at higher levels.”
“The researchers found that when students were given problems to solve, and they did not know methods to solve them, but they were given opportunity to explore the problems, they became curious, and their brains were primed to learn new methods, so that when teachers taught the methods, students paid greater attention to them and were more motivated to learn them. The researchers published their results with the title “A Time for Telling,” and they argued that the question is not “Should we tell or explain methods?” but “When is the best time do this?”
“five suggestions that can work to open mathematics tasks and increase their potential for learning: Open up the task so that there are multiple methods, pathways, and representations. Include inquiry opportunities. Ask the problem before teaching the method. Add a visual component and ask students how they see the mathematics. Extend the task to make it lower floor and higher ceiling. Ask students to convince and reason; be skeptical.”

All quotes from

Jo Boaler, Mathematical Mindsets: Unleashing Students’ Potential through Creative Math, Inspiring Messages and Innovative Teaching

Papamoa College statistics excursion to Hamilton Zoo

Pizza in the park

Last week I had a lovely experience. I visited the Hamilton Observatory and Zoo as part of a Statistics excursion with the Year 13 statistics class of Papamoa College.

The trip was organised to help students learn about where data comes from. I went along because I really love teachers and students, and it was an opportunity to experience innovation by a team of wonderful teachers. The students travelled from Papamoa to Hamilton, stopping for pizza in Cambridge. When we got to the Hamilton Observatory, Dave welcomed us and gave an excellent talk about the stars and data. I found it fascinating to think how much data there is, and also the level of (in)accuracy of their measurements. I then gave a short talk on the importance of statistics in terms of citizenship, and how the students can be successful in learning statistics. I talked about analysis of the Disney Princess movies and the Zika virus.

Turtle

My favourite animal of the day

The next morning we went over to the Hamilton Zoo for breakfast followed by a talk by Ken on the use of data in the Zoo. That too was fascinating, and got my brain whirring. Zoos these days are all about education and helping endangered species to survive. They have records of weights of all the animals over time, making for some very interesting data. Weights are used as an indication of health in the animals. Ken shared pictures of animals being weighed – including tricky keas and fantastically large rhinos. The Zoo also collects a wide range of other data, such as the visitor numbers, satisfaction surveys, quantity of waste and food consumption. We visited the food preparation area and heard how the diets are carefully worked out, and the food fed in such a way as to give the animals something to think about.

Papamoa stats class

Dr Nic and the teachers and students of Papamoa College give statistics two thumbs up!

Though most of my work these days is in the field of statistics education, a part of my heart still belongs to Operations Research. I saw so many ways in which OR could help with things such as diets, logistics etc. I’m not saying that they are doing anything wrong, but there is always room for improvement. Were I still teaching OR to graduate students I would be looking for a project with a zoo.

I am sure the students benefited from the experience of seeing first-hand the use of data in multiple contexts. I was glad to be able to meet with the students and talk to many about the assignments they will be doing throughout the year. Each student has the opportunity to choose an application area for the multiple assessments. I was impressed with their level of motivation, which will lead to better learning outcomes.

Well done team at Papamoa!

What does it mean to understand statistics?

It is possible to get a passing grade in a statistics paper by putting numbers into formulas and words into memorised phrases. In fact I suspect that this is a popular way for students to make their way through a required and often unwanted subject.

Most teachers of statistics would say that they would like students to understand what they are doing. This was a common sentiment expressed by participants in the excellent MOOC, Teaching statistics through data investigations (which is running again from January to May 2016).

Understanding

This makes me wonder what it means for students to understand statistics. There are many levels to understanding things. The concept of understanding has many nuances. If a person understands English, it means that they can use English with proficiency. If they are native speakers they may have little understanding of how grammar works, but they can still speak with correct grammar. We talk about understanding how a car works. I have no idea how a car works, apart from some idea that it requires petrol and the pistons go really, really fast. I can name parts of a car engine, such as distributor and drive shaft. But that doesn’t stop me from driving a car.

Understanding statistics

I propose that when we talk about teaching students to understand statistics, we want our students to know why they are doing something, and have an idea of how it works. Students also need to be fluent in the language of statistics. I would not expect any student of an introductory or high school statistics class to be able to explain how least squares regression works in terms of matrix algebra, but I would expect them to have an idea that the fitted line in a bivariate plot is a model that minimises the sum of the squared errors. I’m not sure anyone needs to know why “degrees of freedom” are called that – or even really what degrees of freedom do. These days computer packages look after degrees of freedom for us. We DO need to understand what a p-value is, and what it is telling us. For many people it is not necessary to know how a p-value is calculated.
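For readers who like to see an idea in action, here is a minimal sketch of what “minimises the sum of the squared errors” means. The data are invented purely for illustration, and the function names are my own. The sketch fits the line with the usual textbook formulas and then checks that nudging the line in any direction makes the total squared error larger.

```python
# Hypothetical bivariate data, invented for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]


def sum_squared_errors(intercept, slope):
    """Total of the squared vertical distances from the points to the line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


def least_squares_line():
    """Intercept and slope of the line that minimises the sum of squared errors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = least_squares_line()
best = sum_squared_errors(intercept, slope)
print(f"Fitted line: y = {intercept:.2f} + {slope:.2f}x, squared error = {best:.3f}")

# Nudge the intercept or the slope a little: the squared error always goes up.
nudges = [(0.5, 0), (-0.5, 0), (0, 0.2), (0, -0.2)]
for d_intercept, d_slope in nudges:
    worse = sum_squared_errors(intercept + d_intercept, slope + d_slope)
    print(f"Nudged by ({d_intercept}, {d_slope}): squared error = {worse:.3f}")
```

A student does not need to write this, but seeing that every other line does worse is the idea I would want them to take away.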

Ways to teach statistics

There are several approaches to teaching statistics. The approach needs to be tailored to the students and the context of the course. I prefer a hands-on, conceptual approach rather than a mathematical one. In current literature and practice there is a push for learning through investigations, often based around the statistical inquiry cycle. The problem with one long project is that students don’t get opportunities to apply principles in different situations, in such a way that will help in transfer of learning to other situations. There are some people who still teach statistics through the mathematical formulas, but I fear they are missing out on the opportunity to help students really enjoy statistics.

I do not claim to have all the answers, but we did discover one way to help students learn, alongside other methods. This approach is to use a short video, followed by a ten-question true/false quiz. The quiz serves to reinforce and elaborate on concepts taught in the video, challenge students’ misconceptions, and help students become more familiar with the vocabulary and terminology of statistics. The quizzes we develop have multiple questions that randomise, giving students the opportunity to try multiple times, which seems to help understanding.

This short and entertaining video gives an illustration of how you can use videos and quizzes to help students learn difficult concepts.

And here is a link to a listing of all our videos and how you can get access to them. Statistics Learning Centre Videos

We have just started a newsletter letting people know of new products and hints for teaching. You can sign up here. Sign up for newsletter

Understanding Statistical Inference

Inference is THE big idea of statistics. This is where people come unstuck. Most people can accept the use of summary descriptive statistics and graphs. They can understand why data is needed. They can see that the way a sample is taken may affect how things turn out. They often understand the need for control groups. Most statistical concepts or ideas are readily explainable. But inference is a tricky, tricky idea. Well actually – it doesn’t need to be tricky, but the way it is generally taught makes it tricky.

Procedural competence with zero understanding

I cast my mind back to my first encounter with confidence intervals and hypothesis tests. I learned how to calculate them (by hand  – yes I am that old) but had not a clue what their point was. Not a single clue. I got an A in that course. This is a common occurrence. It is possible to remain blissfully unaware of what inference is all about, while answering procedural questions in exams correctly.

But, thanks to the research and thinking of a lot of really smart and dedicated statistics teachers, we are able to put a stop to that. And we must. Help us make great resources

We need to explicitly teach what statistical inference is. Students do not learn to understand inference by doing calculations. We need to revisit the ideas behind inference frequently. The process of hypothesis testing is counter-intuitive and so confusing that it spills its confusion over into the concept of inference. Confidence intervals are less confusing, so they are a better intermediate point for understanding statistical inference. But we need to start with the concept of inference.

What is statistical inference?

The idea of inference is actually not that tricky if you unbundle the concept from the application or process.

The concept of statistical inference is this –

We want to know stuff about a large group of people or things (a population). We can’t ask or test them all so we take a sample. We use what we find out from the sample to draw conclusions about the population.

That is it. Now was that so hard?

Developing understanding of statistical inference in children

I have found the paper by Makar and Rubin, presenting a “framework for thinking about informal statistical inference”, particularly helpful. In this paper they summarise studies done with children learning about inference. They suggest that “three key principles … appeared to be essential to informal statistical inference: (1) generalization, including predictions, parameter estimates, and conclusions, that extend beyond describing the given data; (2) the use of data as evidence for those generalizations; and (3) employment of probabilistic language in describing the generalization, including informal reference to levels of certainty about the conclusions drawn.” This can be summed up as Generalisation, Data as evidence, and Probabilistic Language.

We can lead into informal inference early on in the school curriculum. The Key Ideas in the NZ curriculum suggest that “teachers should be encouraging students to read beyond the data. Eg ‘If a new student joined our class, how many children do you think would be in their family?’” In other words, though we don’t specifically use the terms population and sample, we can conversationally draw attention to what we learn from this set of data, and how that might relate to other sets of data.

Explaining directly to Adults

When teaching adults we may use a more direct approach, explaining explicitly, alongside experiential learning, to help them understand inference. We have just completed a video: Understanding Inference. Within the video we have presented three basic ideas condensed from the Five Big Ideas in the very helpful book published by NCTM, “Developing Essential Understanding of Statistics, Grades 9-12” by Peck, Gould, Miller and Zbiek.

Ideas underlying inference

  • A sample is likely to be a good representation of the population.
  • There is an element of uncertainty as to how well the sample represents the population.
  • The way the sample is taken matters.
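The three ideas above can be tried out in a small simulation. This is only a sketch under assumptions of my own: the population is an invented set of 10,000 apple weights, the sample size of 50 is arbitrary, and the “badly taken” sample is one that picks only from the heaviest fifth of the apples.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch gives the same numbers each run

# Hypothetical population: weights in grams of 10,000 apples.
population = [random.gauss(150, 20) for _ in range(10_000)]
population_mean = statistics.mean(population)


def sample_mean(n):
    """Mean weight of a simple random sample of n apples."""
    return statistics.mean(random.sample(population, n))


# Ideas 1 and 2: random samples land near the population mean,
# but each one lands in a slightly different place.
random_means = [sample_mean(50) for _ in range(1000)]
print(f"Population mean:        {population_mean:.1f}")
print(f"Lowest sample mean:     {min(random_means):.1f}")
print(f"Highest sample mean:    {max(random_means):.1f}")

# Idea 3: the way the sample is taken matters.
# Here we only ever pick from the heaviest 20% of the apples.
heaviest_fifth = sorted(population)[-2000:]
biased_mean = statistics.mean(random.sample(heaviest_fifth, 50))
print(f"Mean of biased sample:  {biased_mean:.1f}")
```

The random samples scatter around the population mean, while the biased sample misses it by a wide margin no matter how carefully the mean is calculated.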

These ideas help to provide a rationale for thinking about inference, and allow students to justify what has often been assumed or taught mathematically. In addition several memorable examples involving apples, chocolate bars and opinion polls are provided. This is available for free use on YouTube. If you wish to have access to more of our videos than are available there, do email me at n.petty@statslc.com.

Please help us develop more great resources

We are currently developing exciting innovative materials to help students at all levels of the curriculum to understand and enjoy statistical analysis. We would REALLY appreciate it if any readers here today would help us out by answering this survey about fast food and dessert. It will take 10 minutes at a maximum. We don’t mind what country you are from, and will do the currency conversions. We would love you to forward it to your friends and students to fill out also – the more the merrier! It is an example of a well-designed questionnaire, with a meaningful purpose. And in a few months I will let you know how we got on.

Teaching random variables and distributions

Why do we teach about random variables, and why is it so difficult to understand?

Probability and statistics go together pretty well and basic probability is included in most introductory statistics courses. Often maths teachers prefer the probability section as it is more mathematical than inference or exploratory data analysis. Both probability and statistics deal with the idea of uncertainty and chance, statistics mostly being about what has happened, and probability about what might happen. Probability can be, and often is, reduced to fun little algebraic puzzles, with little link to reality. But a sound understanding of the concepts of probability and distribution is essential to H.G. Wells’s “efficient citizen”.

When I first started on our series of probability videos, I wrote about the worth of probability. Now we are going a step further into the probability topic abyss, with random variables. For an introductory statistics course, it is an interesting question of whether to include random variables. Is it necessary for the future marketing managers of the world, the medical practitioners, the speech therapists, the primary school teachers, the lawyers to understand what a random variable is? Actually, I think it is. Maybe it is not as important as understanding concepts like risk and sampling error, but random variables are still important.

Random variables

Like many concepts in our area, once you get what a random variable is, it can be hard to explain. Now that I understand what a random variable is, it is difficult to remember what was difficult to understand about it. But I do remember feeling perplexed, trying to work out what exactly a random variable was. The lecturers used the term freely, but I remember (many decades ago) just not being able to pin down what a random variable was, and why it needed to exist.

To start with, the words “random variable” are difficult on their own. I have dedicated an entire post to the problems with “random”, and in the writing of it, discovered another inconsistency in the way that we use the word. When we are talking about a random sample, random implies equal likelihood. Yet when we talk about things happening randomly, they are not always equally likely. The word “variable” is also a problem. Surely all variables vary? Students may wonder what a non-random variable is – I know I did.

I like to introduce the idea of variables, as part of mathematical modelling. We can have a simple model:

Cost of event = hall hire + per capita charge × number of guests.

In this model, the hall hire and per capita charge are both constants, and the number of guests is a variable. The cost of the event is also a variable, and can be expressed as a function of the number of guests. And vice versa! Now if we know the number of guests, we can then calculate the cost of the event. But the number of guests may be uncertain – it could be something between 100 and 120. It is thus a random variable.
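The event model above can be written out in a few lines. This is a sketch only: the dollar amounts are invented, and I have assumed that every whole number of guests from 100 to 120 is equally likely, which the model itself does not say.

```python
import random

random.seed(2)  # fixed seed so the sketch is repeatable

HALL_HIRE = 500          # hypothetical constant, in dollars
PER_CAPITA_CHARGE = 40   # hypothetical constant, in dollars per guest


def cost_of_event(number_of_guests):
    """Cost as a function of the number of guests."""
    return HALL_HIRE + PER_CAPITA_CHARGE * number_of_guests


def guests_from_cost(cost):
    """The 'vice versa': number of guests as a function of the cost."""
    return (cost - HALL_HIRE) / PER_CAPITA_CHARGE


# The number of guests is uncertain, so it is a random variable.
# Assumption: any whole number from 100 to 120 is equally likely.
simulated_guests = [random.randint(100, 120) for _ in range(5)]
simulated_costs = [cost_of_event(g) for g in simulated_guests]

for guests, cost in zip(simulated_guests, simulated_costs):
    print(f"{guests} guests -> cost ${cost}")
```

Because the cost is a function of a random variable, the cost is a random variable too, and each run of the event would give a different value.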

Another way to look at a random variable is to come from the other direction – start with the random part and add the variable part. When something random happens, sometimes the outcome is discrete and non-numerical, such as the sex of a baby, the colour of a tulip, or the type of fruit in a lunchbox. But when the random outcome is given a value, then it becomes a random variable.

Distributions

Pictorial representation of different distributions

Then we come to distributions. I fear that too often distributions are taught in such a way that students believe that the normal or bell curve is a property guiding the universe, rather than a useful model that works in many different circumstances. (Rather like Adam Smith’s invisible hand that economists worship.) I’m pretty sure that is what I believed for many years, in my fog of disconnected statistical concepts. Somewhat telling is the tendency for examples to begin with the words, “The life expectancy of a particular brand of lightbulb is normally distributed with a mean of …” or similar. Worse still, some questions don’t even mention the normal distribution, and simply say “The mean income per household in a certain state is $9500 with a standard deviation of $1750. The middle 95% of incomes are between what two values?” Students are left to assume that the normal distribution will apply, which in the second case is only a very poor approximation as incomes are likely to be skewed. This sloppy question-writing perpetuates the idea of the normal distribution as the rule that guides the universe.
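To see how much the unstated model matters in the income question, here is a rough sketch. It answers the question twice: once assuming a normal model, and once assuming a right-skewed model with the same mean and standard deviation. The choice of a lognormal distribution for the skewed model is my own assumption for illustration, not something the question or the textbook supplies.

```python
import math
from statistics import NormalDist

MEAN, SD = 9500, 1750  # the figures given in the question

# Model 1: assume incomes are normally distributed.
normal_model = NormalDist(MEAN, SD)
normal_interval = (normal_model.inv_cdf(0.025), normal_model.inv_cdf(0.975))

# Model 2 (an assumption): incomes are lognormal, with the same mean and SD.
# These are the standard formulas for matching a lognormal's mean and SD.
sigma = math.sqrt(math.log(1 + (SD / MEAN) ** 2))
mu = math.log(MEAN) - sigma ** 2 / 2
log_model = NormalDist(mu, sigma)  # distribution of log(income)
skewed_interval = (
    math.exp(log_model.inv_cdf(0.025)),
    math.exp(log_model.inv_cdf(0.975)),
)

print(f"Normal model:  {normal_interval[0]:.0f} to {normal_interval[1]:.0f}")
print(f"Skewed model:  {skewed_interval[0]:.0f} to {skewed_interval[1]:.0f}")
```

The normal model gives the expected textbook answer of roughly $6070 to $12930. The skewed model, with exactly the same mean and standard deviation, gives a different pair of values, which is the point: the answer depends on a model the question never mentions.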

Take a look at the textbook you use, and see what language it uses when asking questions about the normal distribution. The two examples above are from a popular AP statistics test preparation text.

I thought I’d better take a look at what Khan Academy did to random variables. I started watching the first video and immediately got hit with the flipping coin and rolling dice. No, people – this is not the way to introduce random variables! No one cares how many coins are heads. And even worse he starts with a zero/one random variable because we are only flipping one coin. And THEN he says that he could define a head as 100 and tail as 703 and…. Sorry, I can’t take it anymore.

A good way to introduce random variables

After LOTS of thinking and explaining, and trying stuff out, I have come up with what I think is a revolutionary and fabulous way to introduce random variables and distributions. To begin with we use a discrete empirical distribution to illustrate the idea of a random variable. The random variable models the number of ice creams per customer.

Then we use that discrete distribution to teach about expected value and standard deviation, and combining random variables. The third video introduces the idea of families of distributions, and shows how different distributions can be used to model the same random process.
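As a sketch of the calculations involved, here is a discrete distribution for the number of ice creams per customer. The probabilities are invented for illustration and are not the ones used in the videos. The code finds the expected value and standard deviation, then combines two independent customers to show that the expected values and the variances add, but the standard deviations do not.

```python
import math

# Hypothetical distribution: number of ice creams per customer -> probability.
ice_creams = {1: 0.40, 2: 0.35, 3: 0.15, 4: 0.10}


def expected_value(dist):
    """Probability-weighted average of the values."""
    return sum(value * p for value, p in dist.items())


def variance(dist):
    """Probability-weighted average of squared distances from the mean."""
    mean = expected_value(dist)
    return sum((value - mean) ** 2 * p for value, p in dist.items())


def sum_of_two_independent(dist):
    """Distribution of the total bought by two independent customers."""
    total = {}
    for a, pa in dist.items():
        for b, pb in dist.items():
            total[a + b] = total.get(a + b, 0) + pa * pb
    return total


one_sd = math.sqrt(variance(ice_creams))
two_customers = sum_of_two_independent(ice_creams)
two_sd = math.sqrt(variance(two_customers))

print(f"One customer:  mean {expected_value(ice_creams):.2f}, SD {one_sd:.3f}")
print(f"Two customers: mean {expected_value(two_customers):.2f}, SD {two_sd:.3f}")
print(f"Twice the single SD would be {2 * one_sd:.3f}, which is too large")
```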

Another unusual feature is the introduction of the triangular distribution, which is part of the New Zealand curriculum. You can read here about the benefits of teaching the triangular distribution.

I’m pretty excited about this approach to teaching random variables and distributions. I’d love some feedback about it!

Introducing Probability

I have a guilty secret. I really love probability problems. I am so happy to be making videos about probability just now, and conditional probability and distributions and all that fun stuff. I am a little disappointed that we won’t be doing decision trees with Bayesian review, calculating EVPI. That is such fun, but I gave up teaching that some years ago.

The reason probability is fun is because it is really mathematics, and puzzles and logic. I love permutations and combinations too – there is something cool about working out how many ways something can happen.

So why should I feel guilty? Well, in all honesty I have to admit that there is very little need for most of that in a course about statistics at high-school or entry level university. When I taught statistical methods for management, we did some probability, but only from an applied viewpoint, and we never touched intersection and union signs or anything like that. We applied some distributions, but without much theoretical underpinning.

The GAISE (Guidelines for Assessment and Instruction in Statistics Education) Report says, “Teachers and students must understand that statistics and probability are not the same. Statistics uses probability, much as physics uses calculus.”

The question is, why do we teach probability – apart from the fact that it’s fun and makes a nice change from writing reports on time series and bivariate analysis, inference and experiments? The GAISE report also says, “Probability is an important part of any mathematical education. It is a part of mathematics that enriches the subject as a whole by its interactions with other uses of mathematics. Probability is an essential tool in applied mathematics and mathematical modeling. It is also an essential tool in statistics.”

The concept of probability is as important as it is misunderstood. It is vital to have an understanding of the nature of chance and variation in life, in order to be a well-informed (or “efficient”) citizen. One area in which this is extremely important is in understanding risk and relative risk. When a person is told that their chances of dying of some rare disease have just doubled, it is important that they know that it may be because they have gone from one chance in a million to two chances in a million. Sure it has doubled, but it still is pretty trivial. An understanding of probability is also important in terms of gambling and resistance to the allures of games of chance. And more socially acceptable gambling, such as stockmarket trading, also requires an understanding of chance and variation.
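The rare disease example is worth working through, because the two ways of describing the same change sound so different. This small sketch uses the one-in-a-million and two-in-a-million figures from the example; the variable names are my own.

```python
from fractions import Fraction

risk_before = Fraction(1, 1_000_000)  # one chance in a million
risk_after = Fraction(2, 1_000_000)   # two chances in a million

# Relative risk: how many times larger the new risk is.
relative_risk = risk_after / risk_before

# Absolute increase: how much extra risk a person actually carries.
absolute_increase = risk_after - risk_before

print(f"Relative risk: {relative_risk} times the original risk")
print(f"Absolute increase: {float(absolute_increase):.6%} of people affected")
```

The relative risk of 2 makes the headline, but the absolute increase of one extra case per million people is what matters to the individual.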

The concept of probability is important, and a few rules of probability may help with understanding, but I suspect the mathematicians get carried away and create problems that are unlikely (probability close to zero) to ever occur in reality. Anything requiring a three-way Venn Diagram has moved from applied problem to logic puzzle. This is in stark contrast to the very applied data-driven approach used in teaching statistics in New Zealand.

Teaching Probability

The traditional approach to teaching probability is to start with the coin and the dice and the balls in the urns. As well as being mind-bogglingly boring and pointless, this also projects an artificial certainty about the probabilities, which is confusing when we start discussing models. If you look at the Khan Academy videos (but don’t) you will find trivial examples about coloured balls or sweets or strangely complex problems involving hitting a circular target. The traditional approach is also to teach probability as truth. “The probability of getting a boy is one-half”. What does that even mean?

I am currently reading the new Springer volume, Probabilistic Thinking, and intend to write a review and post it on this blog, if I can get through enough before my review copy expires. It is inspiring and surprisingly gripping (but I don’t think that is enough of a review to earn me a hard copy to keep). There are many great ideas for teaching in it that I hope to pass on in due time.

The New Zealand approach to teaching probability comes from a modelling perspective, right from the start. At level 1, the first two years of schooling, children are exploring chance situations, playing games with a chance element and describing possible outcomes. By years 5 and 6 they are assigning numeric values to the likelihood of an occurrence. They (in the curriculum) are being introduced to model estimates and experimental estimates of probability. Bearing in mind how difficult high school maths teachers are finding the new approach, I don’t have a lot of confidence that the primary teachers are equipped yet to make the philosophical changes, let alone enact them in the classroom.

Teaching Confidence Intervals

If you want your students to understand just two things about confidence intervals, what would they be?

What and what order

When making up a teaching plan for anything it is important to think about whom you are teaching, what it is you want them to learn, and what order will best achieve the most important desired outcomes. In my previous life as a university professor I mostly taught confidence intervals to business students, including MBAs. Currently I produce materials to help teach high school students. When teaching business students, I was aware that many of them had poor mathematics skills, and I did not wish that to get in the way of their understanding. High School students may well be more at home with formulas and calculations, but their understanding of the outside world is limited. Consequently the approaches for these two different students may differ.

Begin with the end in mind

I use the “all of the people, some of the time” principle when deciding on the approach to use in teaching a topic. Some of the students will understand most of the material, but most of the students will only really understand some of the material, at least the first time around. Statistics takes several attempts before you approach fluency. Generally the material students learn will be the material they get taught first, before they start to get lost. Therefore it is good to start with the important material. I wrote a post about this, suggesting starting at the very beginning is not always the best way to go. This is counter-intuitive to mathematics teachers who are often very logical and wish to take the students through from the beginning to the end.

At the start I asked this question – if you want your students to understand just two things about confidence intervals, what would they be?

To me the most important things to learn about confidence intervals are what they are and why they are needed. Learning about the formula is a long way down the list, especially in these days of computers.

The traditional approach to teaching confidence intervals

A traditional approach to teaching confidence intervals is to start with the concept of a sampling distribution, followed by calculating the confidence interval of a mean using the Z distribution. Then the t distribution is introduced. Many of the questions involve calculation by formula. Very little time is spent on what a confidence interval is and why we need them. This is the order used in many textbooks. The Khan Academy video that I reviewed in a previous post does just this.

A different approach to teaching confidence intervals

My approach is as follows:
Start with the idea of a sample and a population, and that we are using a sample to try to find out an unknown value from the population. Show our video about understanding a confidence interval. One comment on this video decried the lack of formulas. I’m not sure what formulas would satisfy the viewer, but as I was explaining what a confidence interval is, not how to get it, I had decided that formulas would not help.

The new New Zealand school curriculum follows a process to get to the use of formal confidence intervals. Previously the assessment was such that a student could pass the confidence interval section by putting values into formulas in a calculator. In the new approach, early high school students are given real data to play with, and are encouraged to suggest conclusions they might be able to draw about the population, based on the sample. Then in Year 12 they start to draw informal confidence intervals, based on the sample.
Then in Year 13, we introduce bootstrapping as an intuitively appealing way to calculate confidence intervals. Students use existing data to draw a conclusion about two medians.
In a more traditional course, you could instead use the normal-based formula for the confidence interval of a mean. We now have a video for that as well.

You could then examine the idea of the sampling distribution and the central limit theorem.
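For those who have not met bootstrapping, here is a minimal sketch of the idea applied to the difference between two medians. The two samples are invented, the number of resamples is arbitrary, and I have used the simple percentile method for the interval, which is one of several ways of doing it. It is not a description of the software students use in class.

```python
import random
import statistics

random.seed(3)  # fixed seed so the sketch is repeatable

# Hypothetical samples from two groups, invented for illustration.
group_a = [12, 15, 14, 18, 21, 16, 19, 17, 22, 20, 15, 18]
group_b = [10, 11, 13, 9, 14, 12, 15, 11, 10, 13, 12, 14]


def bootstrap_median_difference(a, b, resamples=2000):
    """Percentile bootstrap interval (95%) for median(a) - median(b)."""
    differences = []
    for _ in range(resamples):
        # Resample each group with replacement, keeping the original sizes.
        resample_a = random.choices(a, k=len(a))
        resample_b = random.choices(b, k=len(b))
        differences.append(
            statistics.median(resample_a) - statistics.median(resample_b)
        )
    differences.sort()
    # Keep the middle 95% of the resampled differences.
    lower = differences[int(0.025 * resamples)]
    upper = differences[int(0.975 * resamples) - 1]
    return lower, upper


observed = statistics.median(group_a) - statistics.median(group_b)
lower, upper = bootstrap_median_difference(group_a, group_b)
print(f"Observed difference in medians: {observed}")
print(f"Bootstrap interval: {lower} to {upper}")
```

The appeal for teaching is that no formula is needed: the sample stands in for the population, and the spread of the resampled differences shows how uncertain the estimate is.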

The point is that you start with getting an idea of what a confidence interval is, and then you find out how to find one, and then you start to find out the theory underpinning it. You can think of it as successive refinement. Sometimes when we see photos downloading onto a device, they start off blurry, and then gradually become clearer as we gain more information. This is a way to learn a complex idea, such as confidence intervals. We start with the big picture, and not much detail, and then gradually fill out the details of the how and how come of the calculations.

When do we teach the formulas?

Some teachers believe that the students need to know the formulas in order to understand what is going on. This is probably true for some students, but not all. There are many kinds of understanding, and I prefer conceptual and graphical approaches. If formulas are introduced at the end of the topic, then the students who like formulas are satisfied, and the others are not alienated. Sometimes it is best to leave the vegetables until last! (This is not a comment on the students!)

For more ideas about teaching confidence intervals see other posts:
Good, bad and wrong videos about confidence intervals
Confidence Intervals: informal, traditional, bootstrap
Why teach resampling