Engaging students in learning statistics using The Islands.

Three Problems and a Solution

Modern teaching methods for statistics have gone beyond the mathematical calculation of trivial problems. Computers can enable large-scale studies, bringing reality to the subject, but this is not without its own problems.

Problem 1: Giving students experience of the whole statistical process

There are many reasons for students to learn statistics through running their own projects, following the complete statistical enquiry process: posing a problem, planning the data collection, collecting and cleaning the data, analysing the data and drawing conclusions that relate back to the original problem. However, individual projects can be both time consuming and risky, as the quality of the report, and the resultant grade, can depend on the quality of the data collected, which may be beyond the control of the student.

The Statistical Enquiry Cycle, which underpins the NZ statistics curriculum.

Problem 2: Giving students experience of different types of sampling

If students are given an existing database and then asked to sample from it, this can be confusing for students and sends the misleading message that we would not want to use all the data available. But physically taking a sample, based on a sampling frame, can be prohibitively time consuming.

Problem 3: Giving students experience conducting human experiments

The problem here is obvious. It is not ethical to perform experiments on humans simply to learn about performing experiments.

An innovative solution: The Islands virtual world.

I recently ran an exciting workshop for teachers on using The Islands. My main difficulty was getting the participants to stop doing the assigned tasks long enough to discuss how we might implement this in their own classrooms. They were too busy clicking around different villages and people, finding subjects of the right age and getting them to run down a 15-degree slope – all without leaving the classroom.

The Island was developed by Dr Michael Bulmer from the University of Queensland and is a synthetic learning environment. The Islands, the second version, is a free, online, virtual human population created for simulating data collection.

The synthetic learning environment overcomes practical and ethical issues with applied human research, and is used for teaching students at many different levels. For a login, email james.baglin @ rmit.edu.au (without the spaces in the email address).

There are now approximately 34,000 inhabitants of the Islands, who are born, have families (or not) and die in a speeded up time frame where 1 Island year is equivalent to about 28 earth days. They each carry a genetic code that affects their health etc. The database is dynamic, so every student will get different results from it.

The Islanders

Some of the Islanders

Two magnificent features

To me, one of the two best features is the difficulty of acquiring data on individuals. It takes time for students to collect samples, as each subject must be asked individually, and the results recorded in a database. There is no easy access to the population. This is still much quicker than asking people in real life (or “irl” as it is known on social media). It becomes obvious that you need to sample and to have a good sampling plan, and that you need to work out how to record and deal with your data.

The other outstanding feature is the ability to run experiments. You can get a group of subjects and split them randomly into treatment and control groups. Then you can perform interventions, such as making them sit quietly or run about, or drink something, and then evaluate their performance on some other task. This is without requiring real-life ethical approval and informed consent. However, in a touch of reality the people of the Islands sometimes lie, and they don’t always give consent.
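For anyone who wants to show students what "split them randomly" means away from the Islands interface, here is a minimal Python sketch. It is not part of The Islands itself; the function name `randomise` and the Islander labels are made up for illustration.

```python
import random

def randomise(subjects, seed=None):
    """Shuffle the subjects and split them into two groups of (near) equal size."""
    rng = random.Random(seed)
    shuffled = list(subjects)      # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical Islander labels, purely for illustration.
recruits = ["Islander %02d" % i for i in range(1, 21)]
treatment, control = randomise(recruits, seed=1)
print("Treatment:", treatment)
print("Control:  ", control)
```

Fixing the seed makes the allocation reproducible for a write-up; leaving it as `None` gives a fresh allocation each time.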

There are over 200 tasks that you can assign to your people, covering a wide range of topics. They include blood tests, urine tests, physiology, food and drinks, injections, tablets, mental tasks, coordination, exercise, music, environment etc. The tasks occur in real (reduced) time, so you are not inclined to include more tasks than are necessary. There is also the opportunity to survey your Islanders, with more than fifty possible questions. These also take time to answer, which encourages judicious choice of questions.

Uses

In the workshop we used the Islands to learn about sampling distributions. First each teacher took a sample of one male and one female and timed them running down a hill. We made (fairly awful) dotplots on the whiteboard using sticky notes with the individual times on them. Then each teacher took a sample and found the median time. We used very small samples of 7 each as we were constrained by time, but larger samples would be preferable. We then looked at the distributions of the medians and compared that with the distribution of our first sample. The lesson was far from polished, but the message was clear, and it gave a really good feel for what a sampling distribution is.
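The same activity can be mimicked on a computer once students have done it by hand. The sketch below is only a simulation under invented assumptions: the population of run times (mean 20 seconds, standard deviation 4 seconds) is made up and does not come from The Islands.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population: 34,000 downhill run times in seconds.
# The mean of 20 s and spread of 4 s are invented for illustration.
population = [rng.gauss(20, 4) for _ in range(34_000)]

def sample_medians(population, sample_size, n_samples, rng):
    """Return the median of each of n_samples simple random samples."""
    return [
        statistics.median(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]

individuals = rng.sample(population, 30)              # like the first dotplot
medians_7 = sample_medians(population, 7, 30, rng)    # like the second dotplot
medians_30 = sample_medians(population, 30, 30, rng)  # a larger-sample version

for label, values in [("individual times", individuals),
                      ("medians, n = 7", medians_7),
                      ("medians, n = 30", medians_30)]:
    print(f"{label:18s} sd = {statistics.stdev(values):.2f}")
```

The point to draw out is the one from the whiteboard: medians vary less than individual times, and medians of larger samples vary less again.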

Within the New Zealand curriculum, we could also use The Islands to learn about bivariate relationships, sampling methods and randomised experiments.

In my workshop I had educators from across the age groups, and a primary teacher assured me that Year 4 students would be able to make use of this. Fortunately there is a maturity filter so that you can remove options relating to drugs and sexual activity.

James Baglin from RMIT University has successfully trialled the Island with high school students and psychology research methods students. The owners of the Island generously allow free access to it. Thanks to James Baglin, who helped me prepare this post.

Here are links to some interesting papers that have been written about the use of The Islands in teaching. We are excited about the potential of this teaching tool.

Huynh, Baglin, Bedford (2014) Improving the attitudes of high school students towards statistics: An Island-based approach. ICOTS9

Baglin, Reece, Bulmer and Di Benedetto, (2013) Simulating the data investigative cycle in less than two hours: using a virtual human population, cloud collaboration and a statistical package to engage students in a quantitative research methods course.

Bulmer, M. (2010). Technologies for enhancing project assessment in large classes. In C. Reading (Ed.), Proceedings of the Eighth International Conference on Teaching Statistics, July 2010. Ljubljana, Slovenia. Retrieved from http://www.stat.auckland.ac.nz/~iase/publications/icots8/ICOTS8_5D3_BULMER.pdf

Bulmer, M., & Haladyn, J. K. (2011). Life on an Island: A simulated population to support student projects in statistics. Technology Innovations in Statistics Education, 5(1). Retrieved from http://escholarship.org/uc/item/2q0740hv

Baglin, J., Bedford, A., & Bulmer, M. (2013). Students’ experiences and perceptions of using a virtual environment for project-based assessment in an online introductory statistics course. Technology Innovations in Statistics Education, 7(2), 1–15. Retrieved from http://www.escholarship.org/uc/item/137120mt

Don’t teach significance testing – Guest post

The following is a guest post by Tony Hak of Rotterdam School of Management. I know Tony would love some discussion about it in the comments. I remain undecided either way, so would like to hear arguments.

GOOD REASONS FOR NOT TEACHING SIGNIFICANCE TESTING

It is now well understood that p-values are not informative and are not replicable. Soon null hypothesis significance testing (NHST) will be obsolete and will be replaced by the so-called “new” statistics (estimation and meta-analysis). This requires that undergraduate courses in statistics must already teach estimation and meta-analysis as the preferred way to present and analyze empirical results. If not, then the statistical skills of the graduates from these courses will be outdated on the day these graduates leave school. But it is less evident whether or not NHST (though not preferred as an analytic tool) should still be taught. Because estimation is already routinely taught as a preparation for the teaching of NHST, the necessary reform in teaching will not require the addition of new elements in current programs but rather the removal of the current emphasis on NHST or the complete removal of the teaching of NHST from the curriculum. The current trend is to continue the teaching of NHST. In my view, however, teaching of NHST should be discontinued immediately because (1) it is ineffective, (2) it is dangerous, and (3) it serves no aim.

1. Ineffective: NHST is difficult to understand and it is very hard to teach it successfully

We know that even good researchers often do not appreciate the fact that NHST outcomes are subject to sampling variation and believe that a “significant” result obtained in one study almost guarantees a significant result in a replication, even one with a smaller sample size. Is it then surprising that our students also do not understand what NHST outcomes do tell us and what they do not tell us? In fact, statistics teachers know that the principles and procedures of NHST are not well understood by undergraduate students who have successfully passed their courses on NHST. Courses on NHST fail to achieve their self-stated objectives, assuming that these objectives include achieving a correct understanding of the aims, assumptions, and procedures of NHST as well as a proper interpretation of its outcomes. It is very hard indeed to find a comment on NHST in any student paper (an essay, a thesis) that is close to a correct characterization of NHST or its outcomes. There are many reasons for this failure, but obviously the most important one is that NHST is a very complicated and counterintuitive procedure. It requires students and researchers to understand that a p-value is attached to an outcome (an estimate) based on its location in (or relative to) an imaginary distribution of sample outcomes around the null. Another reason, connected to their failure to understand what NHST is and does, is that students believe that NHST “corrects for chance” and hence they cannot cognitively accept that p-values themselves are subject to sampling variation (i.e. chance).
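The claim that p-values are themselves subject to sampling variation can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the true effect (0.5 standard deviations), the group size (32 per group) and the use of a z-test with known standard deviation are choices made here for simplicity, not taken from any study.

```python
import random
from statistics import NormalDist, mean

def replicate_p_value(rng, effect=0.5, n=32, sigma=1.0):
    """Two-sided p-value from one simulated two-group study (z-test, sigma known)."""
    control = [rng.gauss(0.0, sigma) for _ in range(n)]
    treated = [rng.gauss(effect, sigma) for _ in range(n)]
    diff = mean(treated) - mean(control)
    se = sigma * (2 / n) ** 0.5          # standard error of the difference in means
    z = diff / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Twenty exact replications of the same study, with the same true effect.
rng = random.Random(2014)
p_values = [replicate_p_value(rng) for _ in range(20)]
print(" ".join(f"{p:.3f}" for p in sorted(p_values)))
print("replications with p < 0.05:", sum(p < 0.05 for p in p_values), "of 20")
```

Under these assumptions roughly half of the replications come out "significant" in the long run, even though the underlying effect is identical every time.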

2. Dangerous: NHST thinking is addictive

One might argue that there is no harm in adding a p-value to an estimate in a research report and, hence, that there is no harm in teaching NHST in addition to teaching estimation. However, the mixed experience with statistics reform in clinical and epidemiological research suggests that a more radical change is needed. Reports of clinical trials and of studies in clinical epidemiology now usually report estimates and confidence intervals, in addition to p-values. However, as Fidler et al. (2004) have shown, and contrary to what one would expect, authors continue to discuss their results in terms of significance. Fidler et al. therefore concluded that “editors can lead researchers to confidence intervals, but can’t make them think”. This suggests that a successful statistics reform requires a cognitive change that should be reflected in how results are interpreted in the Discussion sections of published reports.

The stickiness of dichotomous thinking can also be illustrated with the results of a more recent study of Coulson et al. (2010). They presented estimates and confidence intervals obtained in two studies to a group of researchers in psychology and medicine, and asked them to compare the results of the two studies and to interpret the difference between them. It appeared that a considerable proportion of these researchers, first, used the information about the confidence intervals to make a decision about the significance of the results (in one study) or the non-significance of the results (of the other study) and, then, drew the incorrect conclusion that the results of the two studies were in conflict. Note that no NHST information was provided and that participants were not asked in any way to “test” or to use dichotomous thinking. The results of this study suggest that NHST thinking can (and often will) be used by those who are familiar with it.
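The error described above can be made concrete with two invented results. The numbers in this sketch are hypothetical and are not the figures used by Coulson et al.; the intervals use the usual normal approximation (estimate plus or minus 1.96 standard errors).

```python
from math import sqrt

Z95 = 1.96  # normal-approximation multiplier for a 95% interval

def ci(estimate, se):
    """Approximate 95% confidence interval as a (low, high) pair."""
    return (estimate - Z95 * se, estimate + Z95 * se)

def excludes_zero(interval):
    low, high = interval
    return low > 0 or high < 0

# Hypothetical results, invented for illustration only.
study_a = {"estimate": 3.0, "se": 1.2}
study_b = {"estimate": 2.0, "se": 1.5}

ci_a = ci(**study_a)
ci_b = ci(**study_b)

# Comparing the studies properly means looking at the difference between them.
diff = study_a["estimate"] - study_b["estimate"]
se_diff = sqrt(study_a["se"] ** 2 + study_b["se"] ** 2)
ci_diff = ci(diff, se_diff)

print("Study A: %.2f to %.2f" % ci_a, "| excludes zero:", excludes_zero(ci_a))
print("Study B: %.2f to %.2f" % ci_b, "| excludes zero:", excludes_zero(ci_b))
print("A minus B: %.2f to %.2f" % ci_diff)
```

With these numbers the interval for study A excludes zero and the interval for study B does not, yet the interval for the difference comfortably spans zero, so the two studies are entirely compatible with each other.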

The fact that it appears to be very difficult for researchers to break the habit of thinking in terms of “testing” is, as with every addiction, a good reason for preventing future researchers from coming into contact with it in the first place and, if contact cannot be avoided, for providing them with robust resistance mechanisms. The implication for statistics teaching is that students should, first, learn estimation as the preferred way of presenting and analyzing research information and that they should be introduced to NHST, if at all, only after estimation has become their routine statistical practice.

3. It serves no aim: Relevant information can be found in research reports anyway

Our experience that teaching of NHST fails its own aims consistently (because NHST is too difficult to understand) and the fact that NHST appears to be dangerous and addictive are two good reasons to immediately stop teaching NHST. But there is a seemingly strong argument for continuing to introduce students to NHST, namely that a new generation of graduates will not be able to read the (past and current) academic literature in which authors themselves routinely focus on the statistical significance of their results. It is suggested that someone who does not know NHST cannot correctly interpret outcomes of NHST practices. This argument has no value for the simple reason that it is assumed in the argument that NHST outcomes are relevant and should be interpreted. But the reason that we have the current discussion about teaching is the fact that NHST outcomes are at best uninformative (beyond the information already provided by estimation) and are at worst misleading or plain wrong. The point is all along that nothing is lost by just ignoring the information that is related to NHST in a research report and by focusing only on the information that is provided about the observed effect size and its confidence interval.

Bibliography

Coulson, M., Healy, M., Fidler, F., & Cumming, G. (2010). Confidence Intervals Permit, But Do Not Guarantee, Better Inference than Statistical Significance Testing. Frontiers in Quantitative Psychology and Measurement, 20(1), 37-46.

Fidler, F., Thomason, N., Finch, S., & Leeman, J. (2004). Editors Can Lead Researchers to Confidence Intervals, But Can’t Make Them Think. Statistical Reform Lessons from Medicine. Psychological Science, 15(2): 119-126.

This text is a condensed version of the paper “After Statistics Reform: Should We Still Teach Significance Testing?” published in the Proceedings of ICOTS9.

The Myth of Random Sampling

I feel a slight quiver of trepidation as I begin this post – a little like the boy who pointed out that the emperor has no clothes.

Random sampling is a myth. Practical researchers know this and deal with it. Theoretical statisticians live in a theoretical world where random sampling is possible and ubiquitous – which is just as well really. But teachers of statistics live in a strange half-real-half-theoretical world, where no one likes to point out that real-life samples are seldom random.

The problem in general

In order for most inferential statistical conclusions to be valid, the sample we are using must obey certain rules. In particular, each member of the population must have an equal chance of being chosen. In this way we reduce the opportunity for systematic error, or bias. When a truly random sample is taken, it is almost miraculous how well we can make conclusions about the source population, with even a modest sample of a thousand. On a side note, if the general population understood this, and the opportunity for bias and corruption were eliminated, general elections and referenda could be done at much less cost, through taking a good random sample.
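To put a number on "almost miraculous", the usual approximate 95% margin of error for a proportion from a simple random sample is 1.96 times the square root of p(1 − p)/n. The sketch below uses the worst case p = 0.5 and assumes the population is much larger than the sample; the function name is just for illustration.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (100, 1_000, 10_000):
    print(f"n = {n:>6}: about +/- {100 * margin_of_error(n):.1f} percentage points")
```

For a sample of 1,000 this comes to about 3.1 percentage points, and the formula does not involve the population size at all, which is why a modest sample can speak for a whole country.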

However! It is actually quite difficult to take a random sample of people. Random sampling is doable in biology, I suspect, where seeds or plots of land can be chosen at random. It is also fairly possible in manufacturing processes. Medical research relies on the use of a random sample, though it is seldom of the total population. Really it is more about randomisation, which can be used to support causal claims.

But the area of most interest to most people is people. We actually want to know about how people function, what they think, their economic activity, sport and many other areas. People find people interesting. To get a really good sample of people takes a lot of time and money, and is outside the reach of many researchers. In my own PhD research I approximated a random sample by taking a stratified, cluster semi-random almost convenience sample. I chose representative schools of different types throughout three diverse regions in New Zealand. At each school I asked all the students in a class at each of three year levels. The classes were meant to be randomly selected, but in fact were sometimes just the class that happened to have a teacher away, as my questionnaire was seen as a good way to keep them quiet. Was my data of any worth? I believe so, of course. Was it random? Nope.

Problems people have in getting a good sample include cost, time and also response rate. Much of the data that is cited in papers is far from random.

The problem in teaching

The wonderful thing about teaching statistics is that we can actually collect real data and do analysis on it, and get a feel for the detective nature of the discipline. The problem with sampling is that we seldom have access to truly random data. By random I do not mean just simple random sampling, the least simple method! Even cluster, systematic and stratified sampling can be a challenge in a classroom setting. And sometimes if we think too hard we realise that what we have is actually a population, and not a sample at all.

It is a great experience for students to collect their own data. They can write a questionnaire and find out all sorts of interesting things, through their own trial and error. But mostly students do not have access to enough subjects to take a random sample. Even if we go to secondary sources, the data is seldom random, and the students do not get the opportunity to take the sample. It would be a pity not to use some interesting data, just because the collection method was dubious (or even realistic). At the same time we do not want students to think that seriously dodgy data has the same value as a carefully collected random sample.

Possible solutions

These are more suggestions than solutions, but the essence is to do the best you can and make sure the students learn to be critical of their own methods.

Teach the best way, pretend and look for potential problems.

Teach the ideal and also teach the reality. Teach about the different ways of taking random samples. Use my video if you like!

Get students to think about the pros and cons of each method, and where problems could arise. Also get them to think about the kinds of data they are using in their exercises, and what biases they may have.
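One way to let students compare the methods side by side is to run them all on the same made-up sampling frame. The sketch below is only an illustration: the frame of 12 classes of 25 students and the function names are invented here.

```python
import random

rng = random.Random(7)

# Hypothetical sampling frame: 12 classes of 25 students, spread over 3 year levels.
frame = [
    {"id": c * 25 + s, "class": c, "year": 9 + c % 3}
    for c in range(12) for s in range(25)
]

def simple_random(frame, n):
    """Every student has the same chance of selection."""
    return rng.sample(frame, n)

def systematic(frame, n):
    """Random starting point, then every k-th student on the list."""
    step = len(frame) // n
    start = rng.randrange(step)
    return frame[start::step][:n]

def stratified(frame, n_per_stratum, key="year"):
    """A simple random sample taken separately within each year level."""
    strata = {}
    for unit in frame:
        strata.setdefault(unit[key], []).append(unit)
    return [u for group in strata.values() for u in rng.sample(group, n_per_stratum)]

def cluster(frame, n_clusters, key="class"):
    """Choose whole classes at random and take everyone in them."""
    chosen = set(rng.sample(sorted({u[key] for u in frame}), n_clusters))
    return [u for u in frame if u[key] in chosen]

for name, chosen in [("simple random", simple_random(frame, 30)),
                     ("systematic", systematic(frame, 30)),
                     ("stratified", stratified(frame, 10)),
                     ("cluster", cluster(frame, 2))]:
    classes = sorted({u["class"] for u in chosen})
    print(f"{name:14s} n = {len(chosen):3d}, classes represented: {classes}")
```

Looking at which classes end up represented under each method is a quick way into the discussion of pros and cons.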

We also need to teach that, used judiciously, a convenience sample can still be of value. For example I have collected data from students in my class about how far they live from university, and whether or not they have a car. This data is not a random sample of any population. However, it is still reasonable to suggest that it may represent all the students at the university – or maybe just the first year students. It possibly represents students in the years preceding and following my sample, unless something has happened to change the landscape. It has worth in terms of inference. Realistically, I am never going to take a truly random sample of all university students, so this may be the most suitable data I ever get. I have no doubt that it is better than no information.

Not all questions are of equal worth. Knowing whether students who own cars live further from university, in general, is interesting but not of great importance. Were I to be researching topics of great importance, such as safety features in roads or medicine, I would have a greater need for rigorous sampling.

So generally, I see no harm in pretending. I use the data collected from my class, and I say that we will pretend that it comes from a representative random sample. We talk about why it isn’t, but then we move on. It is still interesting data, it is real and it is there. When we write up analysis we include critical comments with provisos on how the sample may have possible bias.

What is important is for students to experience the excitement of discovering real effects (or lack thereof) in real data. What is important is for students to be critical of these discoveries, through understanding the limitations of the data collection process. Consequently I see no harm in using non-random, realistic sampled real data, with a healthy dose of scepticism.

Deterministic and Probabilistic models and thinking

The way we understand and make sense of variation in the world affects decisions we make.

Part of understanding variation is understanding the difference between deterministic and probabilistic (stochastic) models. The NZ curriculum specifies the following learning outcome: “Selects and uses appropriate methods to investigate probability situations including experiments, simulations, and theoretical probability, distinguishing between deterministic and probabilistic models.” This is at level 8 of the curriculum, the highest level of secondary schooling. Deterministic and probabilistic models are not familiar to all teachers of mathematics and statistics, so I’m writing about it today.

Model

The term, model, is itself challenging. There are many ways to use the word, two of which are particularly relevant for this discussion. The first meaning is “mathematical model, as a decision-making tool”. This is the one I am familiar with from years of teaching Operations Research. The second way is “way of thinking or representing an idea”. Or something like that. It seems to come from psychology.

When teaching mathematical models in entry level operations research/management science we would spend some time clarifying what we mean by a model. I have written about this in the post, “All models are wrong.”

In a simple, concrete incarnation, a model is a representation of another object. A simple example is that of a model car or a Lego model of a house. There are aspects of the model that are the same as the original, such as the shape and ability to move or not. But many aspects of the real-life object are missing in the model. The car does not have an internal combustion engine, and the house has no soft-furnishings. (And very bumpy floors). There is little purpose for either of these models, except entertainment and the joy of creation or ownership. (You might be interested in the following video of the Lego Parisian restaurant, which I am coveting. Funny way to say Parisian!)

Many models perform useful functions. My husband works as a land-surveyor, and his work involves making models on paper or in the computer, of phenomena on the land, and making sure that specified marks on the model correspond to the marks placed in the ground. The purpose of the model relates to ownership and making sure the sewers run in the right direction. (As a result of several years of earthquakes in Christchurch, his models are less deterministic than they used to be, and unfortunately many of our sewers ended up running the wrong way.)

Our world is full of models:

  • a map is a model of a location, which can help us get from place to place.
  • sheet music is a written model of the sound which can make a song
  • a bus timetable is a model of where buses should appear
  • a company’s financial reports are a model of one aspect of the company

Deterministic models

A deterministic model assumes certainty in all aspects. Examples of deterministic models are timetables, pricing structures, a linear programming model, the economic order quantity model, maps, accounting.

Probabilistic or stochastic models

Most models really should be stochastic or probabilistic rather than deterministic, but this is often too complicated to implement. Representing uncertainty is fraught. Some of the more common stochastic models are queueing models, Markov chains, and most simulations.

For example when planning a school formal, there are some elements of the model that are deterministic and some that are probabilistic. The cost to hire the venue is deterministic, but the number of students who will come is probabilistic. A GPS unit uses a deterministic model to decide on the most suitable route and gives a predicted arrival time. However we know that the actual arrival time is contingent upon all sorts of aspects including road, driver, traffic and weather conditions.
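The school formal can be turned into a tiny model of each kind. All of the figures in this sketch (venue hire, ticket price, number invited, chance of attending) are invented for illustration.

```python
import random

VENUE_HIRE = 2500    # deterministic: a fixed cost (hypothetical figure)
TICKET_PRICE = 30    # deterministic: set by the organisers (hypothetical)
INVITED = 150        # hypothetical number of students invited
P_ATTEND = 0.6       # hypothetical chance that each student comes

def deterministic_profit(expected_attendance):
    """Treat attendance as a known number and return one answer."""
    return expected_attendance * TICKET_PRICE - VENUE_HIRE

def simulated_profit(rng):
    """Let each student decide independently, so attendance varies from run to run."""
    attendance = sum(rng.random() < P_ATTEND for _ in range(INVITED))
    return attendance * TICKET_PRICE - VENUE_HIRE

rng = random.Random(8)
profits = sorted(simulated_profit(rng) for _ in range(10_000))

print("Deterministic answer:", deterministic_profit(INVITED * P_ATTEND))
print("Middle 95% of simulated profits:", profits[250], "to", profits[9749])
print("Estimated chance of a loss:", sum(p < 0 for p in profits) / len(profits))
```

The deterministic model gives a single profit figure of 200; the probabilistic version gives a spread of outcomes and an estimated chance of making a loss, which is the more useful information for the organising committee.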

Model as a way of thinking about something

The term “model” is also used to describe the way that people make sense out of their world. Some people have a more deterministic world model than others, contributed to by age, culture, religion, life experience and education. People ascribe meaning to anything from star patterns, tea leaves and moon phases to ease in finding a parking spot and not being in a certain place when a coconut falls. This is a way of turning a probabilistic world into a more deterministic and more meaningful world. Some people are happy with a probabilistic world, where things really do have a high degree of randomness. But often we are less happy when the randomness goes against us. (I find it interesting that farmers hit with bad fortune such as a snowfall or drought are happy to ask for government help, yet when there is a bumper crop, I don’t see them offering to give back some of their windfall voluntarily.)

Let us say the All Blacks win a rugby game against Australia. There are several ways we can draw meaning from this. If we are of a deterministic frame of mind, we might say that the All Blacks won because they are the best rugby team in the world. We have assigned cause and effect to the outcome. Or we could take a more probabilistic view of it, deciding that the probability that they would win was about 70%, and that on the day they were fortunate. Or, if we were Australian, we might say that the Australian team was far better and it was just a 1 in 100 chance that the All Blacks would win.

I developed the following scenarios for discussion in a classroom. The students can put them in order or categories according to their own criteria. After discussing their results, we could then talk about a deterministic and a probabilistic meaning for each of the scenarios.

  1. The All Blacks won the Rugby World Cup.
  2. Eri did better on a test after getting tuition.
  3. Holly was diagnosed with cancer, had a religious experience and the cancer was gone.
  4. A pet was given a homeopathic remedy and got better.
  5. Bill won $20 million in Lotto.
  6. You got five out of five right in a true/false quiz.

The regular mathematics teacher is now a long way from his or her comfort zone. The numbers have gone, along with the red tick, and there are no correct answers. This is an important aspect of understanding probability – that many things are the result of randomness. But with this idea we are pulling mathematics teachers into unfamiliar territory. Social studies, science and English teachers have had to deal with the murky area of feelings, values and ethics forever. In terms of preparing students for a random world, I think it is territory worth spending some time in. And it might just help them find mathematics/statistics relevant!

Those who can, teach statistics

The phrase I despise more than any in popular use (and believe me there are many contenders) is “Those who can, do, and those who can’t, teach.” I like many of the sayings of George Bernard Shaw, but this one is dismissive, ignorant and born of jealousy. To me, the ability to teach something is a step higher than being able to do it. The PhD, the highest qualification in academia, is a doctorate. The word “doctor” comes from the Latin word for teacher.

Teaching is a noble profession, on which all other noble professions rest. Teachers are generally motivated by altruism, and often go well beyond the requirements of their job-description to help students. Teachers are derided for their lack of importance, and the easiness of their job. Yet at the same time teachers are expected to undo the ills of society. Everyone “knows” what teachers should do better. Teachers are judged on their output, as if they were the only factor in the mix. Yet how many people really believe their success or failure is due only to the efforts of their teacher?

For some people, teaching comes naturally. But even then, there is the need for pedagogical content knowledge. Teaching is not a generic skill that transfers seamlessly between disciplines. You must be a thinker to be a good teacher. It is not enough to perpetuate the methods you were taught with. Reflection is a necessary part of developing as a teacher. I wrote in an earlier post, “You’re teaching it wrong”, about the process of reflection. Teachers need to know their material, and keep up-to-date with ways of teaching it. They need to be aware of ways that students will have difficulties. Teachers, by sharing ideas and research, can be part of a communal endeavour to increase both content knowledge and pedagogical content knowledge.

There is a difference between being an explainer and being a teacher. Sal Khan, maker of the Khan Academy videos, is a very good explainer. Consequently many students who view the videos are happy that elements of maths and physics that they couldn’t do, have been explained in such a way that they can solve homework problems. This is great. Explaining is an important element in teaching. My own videos aim to explain in such a way that students make sense of difficult concepts, though some videos also illustrate procedure.

Teaching is much more than explaining. Teaching includes awakening a desire to learn and providing the experiences that will help a student to learn. In these days of ever-expanding knowledge, a content-driven approach to learning and teaching will not serve our citizens well in the long run. Students need to be empowered to seek learning, to criticize, to integrate their knowledge with their life experiences. Learning should be a transformative experience. For this to take place, the teachers need to employ a variety of learner-focussed approaches, as well as explaining.

It cracks me up, the way sugary cereals are advertised as “part of a healthy breakfast”. It isn’t exactly lying, but the healthy breakfast would do pretty well without the sugar-filled cereal. Explanations really are part of a good learning experience, but need to be complemented by discussion, participation, practice and critique. Explanations are like porridge – healthy, but not a complete breakfast on their own.

Why statistics is so hard to teach

“I’m taking statistics in college next year, and I can’t wait!” said nobody ever!

Not many people actually want to study statistics. Fortunately many people have no choice but to study statistics, as they need it. How much nicer it would be to think that people were studying your subject because they wanted to, rather than because it is necessary for psychology/medicine/biology etc.

In New Zealand, with the changed school curriculum that gives greater focus to statistics, there is a possibility that one day students will be excited to study stats. I am impressed at the way so many teachers have embraced the changed curriculum, despite limited resources, and late changes to assessment specifications. In a few years as teachers become more familiar with and start to specialise in statistics, the change will really take hold, and the rest of the world will watch in awe.

In the meantime, though, let us look at why statistics is difficult to teach.

  1. Students generally take statistics out of necessity.
  2. Statistics is a mixture of quantitative and communication skills.
  3. It is not clear which are right and wrong answers.
  4. Statistical terminology is both vague and specific.
  5. It is difficult to get good resources, using real data in meaningful contexts.
  6. One of the basic procedures, hypothesis testing, is counter-intuitive.
  7. Because the teaching of statistics is comparatively recent, there is little developed pedagogical content knowledge. (Though this is growing.)
  8. Technology is forever advancing, requiring regular updating of materials and teaching approaches.

On the other hand, statistics is also a fantastic subject to teach.

  1. Statistics is immediately applicable to life.
  2. It links in with interesting and diverse contexts, including subjects students themselves take.
  3. Studying statistics enables class discussion and debate.
  4. Statistics is necessary and does good.
  5. The study of data and chance can change the way people see the world.
  6. Technological advances have put the power for real statistical analysis into the hands of students.
  7. Because the teaching of statistics is new, individuals can make a difference in the way statistics is viewed and taught.

I love to teach. These days many of my students are scattered over the world, watching my videos (for free) on YouTube. It warms my heart when they thank me for making clear something that had been confusing. I realise that my efforts are small compared to what their teacher is doing, but it is great to be a part of it.

Statistics is not beautiful (sniff)

Statistics is not really elegant or even fun in the way that a mathematics puzzle can be. But statistics is necessary, and enormously rewarding. I like to think that we use statistical methods and principles to extract truth from data.

This week many of the high school maths teachers in New Zealand were exhorted to take part in a Stanford MOOC about teaching mathematics. I am not a high school maths teacher, but I do try to provide worthwhile materials for them, so I thought I would take a look. It is also an opportunity to look at how people with an annual budget of more than 4 figures produce on-line learning materials. So I enrolled and did the first lesson, which is about people’s attitudes to math(s) and their success or trauma that has led to those attitudes. I’m happy to say that none of this was new to me. I am rather unhappy that it would be new to anyone! Surely all maths teachers know by now that how we deal with students’ small successes and failures in mathematics will create future attitudes leading to further success or failure. If they don’t, they need to take this course. And that makes me happy – that there is such a course, on-line and free for all maths teachers. (As a side note, I loved that Jo, the teacher, switched between the American “math” and the British/Australian/NZ “maths”.)

I’ve only done the first lesson so far, and intend to do some more, but it seems to be much more about mathematics than statistics, and I am not sure how relevant it will be. And that makes me a bit sad again. (It was an emotional journey!)

Mathematics in its pure form is about thinking. It is problem solving and it can be elegant and so much fun. It is a language that transcends nationality. (Though I have always thought the Greeks get a rough deal as we steal all their letters for the scary stuff.) I was recently asked to present an enrichment lesson to a class of “gifted and talented” students. I found it very easy to think of something mathematical to do – we are going to work around our Rogo puzzle, which has some fantastic mathematical learning opportunities. But thinking up something short and engaging and realistic in the statistics realm is much harder. You can’t do real statistics quickly.

On my run this morning I thought a whole lot more about this mathematics/statistics divide. I have written about it before, but more in defense of statistics, and warning the mathematics teachers to stay away or get with the programme. Understanding commonalities and differences can help us teach better. Mathematics is pure and elegant, and borders on art. It is the purest science. There is little beautiful about statistics. Even the graphs are ugly, with their scattered data and annoying outliers messing it all up. The only way we get symmetry is by assuming away all the badly behaved bits. Probability can be a bit more elegant, but with that we are creeping into the mathematical camp.

English Language and English literature

I like to liken. I’m going to liken maths and stats to English language and English literature. I was good at English at school, and loved the spelling and grammar aspects especially. I have in my library a very large book about the English language (The Cambridge Encyclopedia of the English Language, by David Crystal) and one day I hope to read it all. It talks about sounds and letters, words, grammar, syntax, origins, meanings. Even to dip into, it is fascinating. On the other hand I have recently finished reading “The End of Your Life Book Club” by Will Schwalbe, which is a biography of his amazing mother, set around the last two years of her life as she struggles with cancer. Will and his mother are avid readers, and use her time in treatment to talk about books. This book has been an epiphany for me. I had forgotten how books can change your way of thinking, and how important fiction is. At school I struggled with the literature side of English, as I wanted to know what the author meant, and could not see how it was right to take my own meaning from a book, poem or work of literature. I have since discovered post-modernism and am happy drawing my own meaning.

So what does this all have to do with maths and statistics? Well, I liken maths to English language. In order to be good at English you need to be able to read and write in a functional way. You need to know the mechanisms. You need to be able to DO, not just observe. In mathematics, you need to be able to approach a problem in a mathematical way. Conversely, to be proficient in literature, you do not need to be able to produce literature. You need to be able to read literature with a critical mind, and appreciate the ideas, the words, the structure. You do need to be able to write enough to express your critique, but that is a different matter from writing a novel. This, to me, is like being statistically literate – you can read a statistical report, and ask the right questions. You can make sense of it, and not be at the mercy of poorly executed or mendacious research. You can even write a summary or a critique of a statistical analysis. But you do not need to be able to perform the actual analysis yourself, nor do you need to know the exact mathematical theory underlying it.

Statistical Literacy?

Maybe there is a problem with the term “statistical literacy”. The traditional meaning of literacy includes being able to read and write – to consume and to produce – to take meaning and to create meaning. I’m not convinced that what is called statistical literacy is the same.

Where I’m heading with this is that statistics is a way to win back the mathematically disenfranchised. If I were teaching statistics to a high school class I would spend some time talking about what statistics involves and how it overlaps with, but is not, mathematics. I would explain that even people who have had difficulty in the past with mathematics can do well at statistics.

The following table outlines the different emphasis of the two disciplines.

Mathematics | Statistics
Proficiency with numbers is important | Proficiency with numbers is helpful
Abstract ideas are important | Concrete applications are important
Context is to be removed so that we can model the underlying ideas | Context is crucial to all statistical analysis
You don’t need to write very much | Written expression in English is important

Another idea related to this is that of “magic formulas” or the cookbook approach. I don’t have a problem with cookbooks and knitting patterns. They help me to make things I could not otherwise. However, the more I use recipes and patterns, the more I understand the principles on which they are based. But this is a thought for another day.

The importance of being wrong

We don’t like to think we are wrong

One of the key ideas in statistics is that sometimes we will be wrong. When we report a 95% confidence interval, we will be wrong 5% of the time. Or in other words, about 1 in 20 of 95% confidence intervals will not contain the population parameter we are attempting to estimate. That is how they are defined. The thing is, we always think we are part of the 95% rather than the 5%. Mostly we will be correct, but if we do enough statistical analysis, we will almost definitely be wrong at some point. However, human nature is such that we tend to think it will be someone else. There is also a feeling of blame associated with being wrong. The feeling is that if we have somehow missed the true value with our confidence interval, it must be because we have made a mistake. However, this is not true. In fact we MUST be wrong about 5% of the time, or our interval is too big, and not really a 95% confidence interval.
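The "wrong about 5% of the time" idea can be checked by simulation. The sketch below is only an illustration, not part of the original exercise: it assumes a proportion estimated from a simple random sample and the usual normal-approximation interval (estimate ± 1.96 standard errors). The function name `coverage` and the values chosen for `true_p`, `n` and `trials` are hypothetical.

```python
import math
import random

def coverage(true_p, n, trials, seed=0):
    """Share of simulated 95% intervals that contain true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and count the successes.
        wins = sum(rng.random() < true_p for _ in range(n))
        p_hat = wins / n
        # Normal-approximation margin of error (an assumption of this sketch).
        margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / trials

rate = coverage(true_p=0.5, n=400, trials=2000)
print(f"Share of intervals containing the true value: {rate:.3f}")
```

The share should land close to 0.95 rather than at 1. The intervals that miss are not mistakes in the calculation; they are the price of a 95% interval.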

The term “margin of error” appears with increasing regularity as elections approach and polling companies are keen to make money out of sooth-saying. The common meaning of the margin of error is half the width of a 95% confidence interval. So if we say the margin of error is 3%, then about one time in twenty, the true value of the proportion will actually be more than 3 percentage points away from the reported sample value.

What doesn’t help is that we seldom do know if we are correct or not. If we knew the real population value we wouldn’t be estimating it. We can contrive situations where we do know the population but pretend we don’t. If we do this in our teaching, we need to be very careful to point out that this doesn’t normally happen, but does in “classroom world” only. (Thanks to MD for this useful term.) General elections can give us an idea of being right or wrong after the event, but even then the problem of non-sampling error is conflated with sampling error. When opinion polls turn out to miss the mark, we tend to think of the cause as being due to poor sampling, or people changing their minds, or any number of imaginative explanations rather than simple, unavoidable sampling error.

So how do we teach this in such a way that it goes beyond school learning and is internalised for future use as effective citizens?

Teaching suggestions

I have two suggestions. The first is a series of True/False statements that can be used in a number of ways. I have them as part of on-line assessment, so that the students are challenged by them regularly. They could be well used in the classroom as part of a warm-up exercise at the start of a lesson. Students can write their answers down or vote using hands.

Here are some examples of True/False statements (some of which could lead to discussion):

  1. You never know if your confidence interval contains the true population value.
  2. If you make your confidence interval wide enough you can be sure that you contain the true population value.
  3. A confidence interval tells us where we are pretty sure the sample statistic lies.
  4. It is better to have a narrow confidence interval than a wide one, as it gives us more certain information, even though it is more likely to be wrong.
  5. If your study involves twenty confidence intervals, then you know that exactly one of them will be wrong.
  6. If a confidence interval doesn’t contain the true population value, it is because it is one of the 5% that was calculated incorrectly.

You can check your answers at the end of this post.

Experiential exercise

The other teaching suggestion is for an experiential exercise. It requires a little set up time.

Make a set of cards for students with numbers on them that correspond to the point estimate of a proportion, or a score that will lead to that. (Specifications for a set of 35 cards representing the results from a proportion of 0.54 and 25 trials are given below.)

Introduce the exercise as follows:
“I have a computer game, and have set the ratio of wins to losses at a certain value. Each of you has played 25 times, and the number of wins you have obtained will be on your card. It is really important that you don’t look at other people’s cards.”

Hand them out to the students. (If you have fewer than 35 in your class, it might be a good idea to make sure you include the cards with 8 and 19 in the set you use – sometimes it is ok to fudge slightly to teach a point.)
“Without getting information from anyone else, write down your best estimate of the true proportion of wins to losses in the game. Do you think you are correct? How close do you think you are to the true value?”

They will need to divide the number of wins by 25, which should not lead to any computational errors! The point is that they really can’t know how close their estimate is to the true value – and what does “correct” mean?

Then work out the margin of error for a sample of size 25, which in this case is estimated at 20%. Get the students to calculate their 95% confidence intervals, and decide if they have the interval that contains the true population value. Get them to commit one way or the other.
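One way to arrive at a margin of roughly 20% is sketched below. It assumes the conservative normal-approximation formula with the proportion set to 0.5, which the post does not spell out; the rule of thumb 1/√n gives exactly 0.2 for n = 25. The function name `margin_of_error` and the example card value of 14 wins are illustrative only.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of an approximate 95% interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)

margin = margin_of_error(25)          # about 0.196, i.e. roughly 20%
rule_of_thumb = 1 / math.sqrt(25)     # exactly 0.2

# A student holding a card showing 14 wins out of 25:
estimate = 14 / 25
interval = (estimate - margin, estimate + margin)
print(f"margin = {margin:.3f}, interval = ({interval[0]:.2f}, {interval[1]:.2f})")
```

In class, rounding the margin to 0.20 keeps the arithmetic easy: each student simply adds and subtracts 0.20 from their own proportion.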

Now they can talk to each other about the values they have.

There are several ways you can go from here. You can tell them what the population proportion was from which the numbers were drawn (0.54). They can then see that most of them had confidence intervals that included the true value, and some didn’t. Or you can leave them wondering, which is a better lesson about real life. Or you can do one exercise where you do tell them and one where you don’t.

This is an area where probability and statistics meet. You could make a nice little binomial distribution problem out of being correct in a number of confidence intervals. There are potential problems with independence, so you need to be a bit careful with the wording. For example: Fifteen students undertake separate statistical analyses on the topics of their choice, and construct 95% confidence intervals. What is the probability that all the confidence intervals are correct, in that they do contain the estimated population parameter? This is well modelled by a binomial distribution with n = 15 and p = 0.05, where X is the number of intervals that miss. P(X = 0) = 0.46. And another interesting idea – what is the probability that two or more are incorrect? 0.17 is the answer. So there is a 17% chance that more than one of the confidence intervals does not contain the population parameter of interest.
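The two binomial figures can be reproduced in a few lines. This is a minimal sketch using only the standard library (`math.comb` needs Python 3.8 or later); the helper name `binom_pmf` is made up for the illustration, and X counts the intervals that miss, with the fifteen analyses assumed independent.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial distribution with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 15, 0.05   # fifteen intervals, each with a 5% chance of missing

p_none = binom_pmf(0, n, p)                                   # all fifteen are correct
p_two_or_more = 1 - binom_pmf(0, n, p) - binom_pmf(1, n, p)   # at least two miss

print(round(p_none, 2), round(p_two_or_more, 2))
```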

This is an area that needs careful teaching, and I suspect that some teachers have only a sketchy understanding of the idea of confidence intervals and margins of error. It is so important to know that statistical results are meant to be wrong some of the time.

Answers: T,T,F, debatable, F,F.

Data for the 35 cards:

Number on card:    8   9  10  11  12  13  14  15  16  17  18  19
Number of cards:   1   1   2   3   5   5   6   5   3   2   1   1
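The card set can be checked against the exercise. The sketch below assumes the rounded margin of error of 0.20 and the true proportion of 0.54 given in the post; the variable names are illustrative. It shows which cards produce an interval that misses the true value, which is why the 8 and 19 cards are worth keeping in a smaller set.

```python
# Number on card -> number of cards, as listed above.
cards = {8: 1, 9: 1, 10: 2, 11: 3, 12: 5, 13: 5,
         14: 6, 15: 5, 16: 3, 17: 2, 18: 1, 19: 1}

TRIALS = 25     # games played per student
MARGIN = 0.20   # rounded margin of error for n = 25
TRUE_P = 0.54   # proportion the cards were drawn from

total = sum(cards.values())

# A card "misses" when the true value lies outside estimate +/- margin.
missing = [wins for wins in cards if abs(wins / TRIALS - TRUE_P) > MARGIN]
missed_cards = sum(cards[wins] for wins in missing)

print(f"{total} cards in total; cards that miss: {missing} "
      f"({missed_cards} of {total}, about {missed_cards / total:.0%})")
```

Two of the 35 intervals miss, a little under 6%, which is about what a 95% confidence interval should deliver.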

A dearth of raw data

The desired outcome of this post is to be proved wrong.

Here is my assertion: It is really difficult to find appropriate sets of data to use for teaching and assessing statistical analysis.

This is a problem; one of the key factors in teaching statistics effectively is to use real data. I have written about the need for real data (not faked) in my post Stop faking it, data should be real. I’d like to apologise here and now for my arrogant assertion that “The internet abounds with data. We can just about drown in it.” I feel like the ancient mariner staring at the data abounding, with no drop fit to drink, let alone drown in.

Recently a teacher contacted me to help her find a set of data for an assessment task in Year 13 statistics. The data set needs to have the following characteristics:

  • It must be real
  • A sample (not a population)
  • Multivariate so that the students have a choice of variables to model
  • Have at least one variable of interval/ratio data
  • Have at least one way of dividing the sample into two groups
  • It should not be a set that has previously been used for assessment in the public domain in New Zealand.
  • It should be of interest to the students
  • It should be open to background research
  • Ideally it should be randomly sampled
  • It should preferably be from New Zealand (Australia is near enough), and not too old.

How hard could that be? (I joke, of course – it is very hard.)

I fancy I am pretty good at ferreting things out on the internet, but though I found wonderful sites with lots of sets of data, I could not find one set to fit the criteria. And the problem is, this will need to happen every year in every school in New Zealand, often more than once.

This is not a unique problem, I suspect. When I taught at university I was challenged to come up with appropriate data sets each year for assessment exercises. Consequently we would sometimes rotate data sets in a three year cycle, or (oh the shame) make fake data.

All over the world people are collecting data and doing analysis. Why is it so difficult to find raw data?

One issue is that of privacy – in New Zealand we have strict laws with regard to privacy and informed consent, which means that it is easier to keep the data hidden rather than try to anonymise it for general consumption. Surely that is not the case in non-human research, though. It takes a bit of work to make data available, and academics and researchers do not have time to spare. Some data is commercially sensitive, forbidding its release to the public domain. Often what look like promising data sets are not at a unit level, but are summarised into tables for the reader.

I went searching for links to data sets, and found the following. So I guess there is data out there, but it is time-consuming to find appropriate sets. And very little of it relates to NZ, sadly. And baseball, basketball and medical sets abound.

http://www.statsci.org/datasets.html looks promising, and I am grateful for the efforts. However very few of the sets meet the criteria.

http://iase-web.org/Links.php?p=Datasets has links to other sources

http://www.amstat.org/publications/jse/jse_data_archive.htm This one has the most informative layout, in terms of finding out whether the data base is likely to be useful.

So in a way I have proved myself wrong already. There are datasets out there, but it is difficult to find one that is just right! I feel for teachers having to trawl through so many sites to find something, though. I had hoped that there would be sets of data along with PhD dissertations, but even in the area of statistics education, I couldn’t find any.

I don’t have an answer to this problem. As a uni lecturer I solved it for my own class by collecting data from them, pretending that it was a random sample of first year university students, and giving it back to them  to play with. Obviously not ideal, but fun!

Please share suggestions in the comments.

Probability and Deity

Our perception of chance affects our worldview

There are many reasons that I am glad that I majored in Operations Research rather than mathematics or statistics. My view of the world has been affected by the OR way of thinking, which combines hard and soft aspects. Hard aspects are the mathematics and the models, the stuff of the abstract world. Soft aspects relate to people and the reality of the concrete world.  It is interesting that concrete is soft! Operations Research uses a combination of approaches to aid in decision making.

My mentor was Hans Daellenbach, who was born and grew up in Switzerland, did his postgrad study in California, and then stepped back in time several decades to make his home in Christchurch, New Zealand. Hans was ahead of his time in so many ways. The way I am writing about today was his teaching about probability and our subjective views on the likelihood of events.

Thanks to Daniel Kahneman’s publishing and 2002 Nobel prize, the work by him and Amos Tversky is reaching into the popular realm and is even in the high school mathematics curriculum, in a diluted form. Hans Daellenbach required first year students to read a paper by Tversky and Kahneman in the late 1980s, over a decade earlier. This was not popular, either with the students or the tutors who were trying to make sense of the paper. Eventually we made up some interesting exercises in tutorials, and structured the approach enough for students to catch on. (Sometimes nearly half our students were from a non-English speaking background, so reading the paper was extremely challenging for them.) As a tutor and later a lecturer, I internalised the thinking, and it changed the way I see the world and chance.

People’s understanding of probability and chance events has an impact on how they see the world as well as the decisions they make.

For example, Kahneman introduced the idea of the availability heuristic. This means that if someone we know has been affected by a catastrophic (or wonderful) unlikely event, we will perceive the possibility of such an event as more likely. For example if someone we know has had their house broken into, then we feel less secure, as we perceive the likelihood of that as increased.  Someone we know wins the lottery, and suddenly it seems possible for us. Nothing has changed in the world, but our perception has changed.

There is another easily understood concept of confirmation bias. We notice and remember events and sequences of events that reinforce or confirm our preconceived notions. “Bad things come in threes” is a wonderful example. Something bad or two things bad happen, so we look for or wait for the third, and then stop counting. Similarly we remember the times when our lucky number is lucky, and do not remember the unlucky times. We mentally record the times our hunches pay off, and quietly forget the times they don’t.

So how does this affect us as teachers of statistics? Are there ethical issues involved in how we teach statistics?

I believe in God and I believe that He guides me in my decisions in life. However I do not perceive God as a “micro-manager”. I do not believe that He has time in His day to help me to find carparks, and to send me to bargains in the supermarket. I may be wrong, and I am prepared to be proven wrong, but this is my current belief. There are many people who believe in God (or in that victim-blaming book, “The Secret”), who would disagree with me. When they see good things happen, they attribute them to the hand of God, or karma or The Secret. There are people in some cultures who do not believe in chance at all. Everything occurs as God’s will, hence the phrase, “insha’Allah”, or “God willing”. If they are delayed in traffic, or run into a friend, or lose their job, it is because God willed it so. This is undoubtedly a simplistic explanation, but you get the idea.

Now along comes the statistics teacher and teaches probability. Mathematically there are some things for which the probability is easily modelled. Dice, cards, counters, balls in urns, socks in drawers can all have their probability modelled, using the ratio of the number of favourable outcomes to the number of equally likely possible outcomes. There are also probabilities estimated using historic frequencies, and there are subjective estimates of probabilities. Tversky and Kahneman’s work showed how flawed humans are at the subjective estimates.

For some (most?) students probability remains “school-knowledge” and makes no difference to their behaviour and view of the world. It is easy to see this on game-shows such as “Deal or No Deal”, my autistic son’s favourite. It is clear that except for the decision to take the deal or not, there is no skill whatsoever in this game. In the Australian version, members of the audience hold the different cases and can guess what their case holds. If they get it right they win $500. When this happens they are praised – well done! When the main player is choosing cases, he or she is warned that they will need to be careful to avoid the high value cases. This is clearly impossible, as there is no way of knowing which cases contain which values. Yet they are praised, “Well done!” for cases that contain low values. Sometimes they even ask the audience members what they think they are holding in the case. This makes for entertaining television – with loud shouting at times to “Take the Deal!”. But it doesn’t imbue me with any confidence that people understand probability.

Having said that, I know that I act irrationally as well. In the 1990s there were toys called Tamagotchis which were electronic pets. To keep your pet happy you had to “play” with it, which involved guessing which way the pet would turn. I KNEW that it made NO difference which way I chose and that I would do just as well by always choosing the same direction. Yet when the pet had turned to the left four times in succession, I would choose turning to the right. Assuming a good random number generator in the pet, this was pointless. But it also didn’t matter!
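The pointlessness of switching after a run can be shown with a small simulation. This is a sketch under the same assumption made above, a good random number generator, here taken to mean that each turn is left or right with equal probability regardless of what came before. The function names and the number of rounds are invented for the illustration.

```python
import random

def play(strategy, rounds, seed=1):
    """Return the share of correct guesses for a guessing strategy."""
    rng = random.Random(seed)
    history = []
    correct = 0
    for _ in range(rounds):
        guess = strategy(history)
        turn = rng.choice("LR")      # the pet turns left or right, 50/50
        correct += guess == turn
        history.append(turn)
    return correct / rounds

def always_left(history):
    """Ignore the past and always guess left."""
    return "L"

def expect_a_change(history):
    """After four identical turns in a row, bet on the opposite direction."""
    if len(history) >= 4 and len(set(history[-4:])) == 1:
        return "R" if history[-1] == "L" else "L"
    return "L"

rate_same = play(always_left, 100_000)
rate_switch = play(expect_a_change, 100_000)
print(f"always left: {rate_same:.3f}, switch after a run: {rate_switch:.3f}")
```

Both strategies should come out at about half the guesses correct, so the run of four lefts carries no information.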

So if I, who have a fairly sound understanding of probability distributions and chance, still think about which way my tamagotchi is going to turn, I suspect truly rational behaviour in the general populace with regard to probabilistic events is a long time coming! Astrologers, casinos, weather forecasters, economists, lotteries and the like will never go broke.

However there are other students for whom a better understanding of the human tendency to find patterns and confirm beliefs could provide a challenge. Their parents may strongly believe that God intervenes often or that there is no uncertainty, only lack of knowledge. (In a way this is true, but that’s a topic for another day.) Like the child who has just discovered the real source of Christmas bounty, probability models are something to ponder, and can be disturbing.

We do need to be sensitive in how we teach probability. Not only can we shake people’s beliefs, but we can also use insensitive examples. I used to joke about how car accidents are a Poisson process with batching, which leads to a very irregular series. Then for the last two and a half years I have been affected by the Christchurch earthquakes. I have no sense of humour when it comes to earthquakes. None of us do. When I saw in a textbook an example of the probability of a building falling down as a result of an earthquake, I found that upsetting. A friend was in such a building and, though she physically survived, it will be a long time before she will have a full recovery, if ever. Since then I have never used earthquakes as an example of a probabilistic event when teaching in Christchurch. I also refrain as far as possible from using other examples that may stir up pain, or try to treat them in a sober manner. Breast cancer, car accidents and tornadoes kill people and may well have affected our pupils. Just a thought.

Teaching statistical report-writing

Teaching how to write statistical reports

It is difficult to write statistical reports and it is difficult to teach how to write statistical reports.

When statistics is taught in the traditional way, with emphasis on the underlying mathematics, the process of statistics is truncated at both ends. When we concentrate on the sterile analysis, the messy “writing stuff” is avoided. Students do not devise their own investigative questions, and they do not write up the results.

Here’s the thing though – in reality, the analysis step of a statistical investigation is a very small part of the whole, and performed at the click of a button or two.

Ultimately the embedding of the analysis back into an investigation should not be a problem. The really interesting part of statistics happens all around the analysis. Understanding the context enriches the learning, transforming the discipline from mathematics to statistics. We can help students embrace the excitement of a true statistical investigation. But in this time of transition, the report-writing aspects are a problem. They are a problem for the learner and for the teacher.

The new New Zealand curriculum for statistics requires report-writing as an essential component of the majority of assessment, particularly at the final year of high school. This is causing understandable concern among teachers, who come predominantly from a mathematical background. I can imagine myself a few years ago saying, “I became a maths teacher so I wouldn’t have to teach and mark essays!” In addition, the results from the students are less than stellar, even from capable students. Teachers do not like their students to perform poorly.

All statistics courses should have a component of report-writing, unless they are courses in the mathematics of statistics. The problem here is that, like the secondary school teachers in New Zealand, many statistics instructors are dealing with the mathematics more than the application of statistics, and are not confident of their own ability at report-writing. Normal human behaviour is to avoid it. Having taught service statistics courses in a business school for two decades, I have gradually made the transition to more emphasis on report-writing and am convinced that statistical report-writing needs to be taught explicitly, and taught well.

Report-writing is a fundamental and useful skill

For teachers who are uncomfortable with teaching and marking reports, it would be nice to dismiss the process of report-writing  as “not important”. Much of statistics teaching is in a service course, as discussed in my previous blog. It is unlikely that any of these students will ever have to write a report on a statistical analysis, other than as part of the assessment for the course.  So why do we put them and ourselves through this?

You don’t realise whether you understand or not until you try to write it down.

The written word requires a higher level of precision than a thought or a spoken explanation. Your sentences look at you from the page and mock you with their vagueness and ambiguity. I find this out time and again as I blog. What seems like a well thought out argument in my head as I do my morning run, falls to shreds on paper, before being mustered into some semblance of order. It is in writing that we identify the flaws in our understanding. As we try to write our findings we become more aware of fuzzy thinking and gaps in reasoning. As we write we are required to organise our thoughts.

Better critics of other reports

A student who has been required to produce a report of a good standard will be exposed to examples of good and bad reports and will be better able to identify incorrect thinking in reports they read themselves. This is perhaps the most important purpose of a terminal course in statistics. Having said that, it is both heart-warming and alarming to hear from past-students the wonderful things they are doing with the statistics they learned in my one-semester course.

Useful skill for employment

Students need to be able to read and write as part of empowered citizenship. The skill of writing a coherent report in good English is highly sought after by employers, and of great use at university in just about every discipline. It is a transferable skill to many endeavours.

Reports are needed for assessment

On a practical level, if the teacher is going to evaluate understanding they need evidence to work from. A written report provides one form of evidence of understanding.

Report-writing is difficult to teach

Some maths teachers may feel inadequate in teaching “English”, as they see report-writing. They do not have the pedagogical content knowledge in teaching writing that they do for teaching algebra or percentages, for instance. Pedagogical content knowledge is more than the intersection of knowing a subject, and being able to teach in a general sort of way. It is the knowledge of how to teach a certain discipline, what is difficult to learners, and how to help them learn.

Some basic ideas for teaching report-writing

To write a good report you need to understand what is going on, have the appropriate vocabulary, and use a clear structure. Good teaching will emphasise understanding. Getting students to write sentences about output and share them with their peers is a great way to identify misunderstandings. As these sentences are shared, the teacher can model the use of correct technical language. They can say, for instance, “You have the essence correct here, but there are some more precise terms you could use, such as …” Teachers can either give students outlines for reports, or they can give them several good reports and get the students to identify the underlying structure. I am a firm believer in the generous use of headings within a report. They provide signposts for writer and reader alike.

Report-writing requires practice. The assessment report should not be the first report of that type that a student writes. In a world of motivated students with no other demands on their time, it would be great to have them write up one assignment for practice and then learn from that to produce a better one. I am aware that students tend not to do the work unless there is a grade attached to it, so it can be difficult to get a student to do a “practice report” ahead of the “real assessment.” There are alternatives that approximate this, however, and that require less input from the teacher. One of these, the use of templates, is explained in an earlier post, Templates for statistical reports – spoon-feeding?

There is nothing wrong with using templates and “sensible sentences”. (Not to be confused with “sensible sentencing”, which seems devoid of sense.) There are only so many ways to say that “the median number of pairs of shoes owned by women is ten.” It is also a difficult sentence to make sound elegant. Good reports will look similar. This is not creative-writing – it is report-writing. Sure, the marking may be boring when all the reports seem very similar, but that is a small price to pay to avoid banging your head against the desk at bizarre and disorganised offerings.

This is but a musing on the teaching of report-writing. Glenda Francis, in “An approach to report writing in statistics courses”, identifies similar issues and provides a fuller background to the problem. She also indicates that there is much to be done in developing this area of teaching and research. I will be providing professional development in this area over the next month to at least three groups of teachers, and I look forward to learning a great deal from them as we explore these issues together.