Spreadsheets, statistics, mathematics and computational thinking

We need to teach all our students how to design, create, test, debug and use spreadsheets. We need to teach this integrated with mathematics, statistics and computational thinking. Spreadsheets can be a valuable tool in many other subject areas including biology, physics, history and geography, thus facilitating integrated learning experiences.

Spreadsheets are versatile and ubiquitous – and most have errors. A web search on “How many spreadsheets have errors?” gives alarming results. The commonly quoted figure is 88%. These spreadsheets with errors are not just little home spreadsheets for cataloguing your Lego collection or planning your next vacation. These spreadsheets with errors involve millions of dollars, and life-affecting medical and scientific research.

Using spreadsheets to teach statistics

Use a spreadsheet to draw graphs

One of the great contributions computers make to statistical analysis is the ability to display graphs of non-trivial sets of data without onerous drawing by hand. In the early 1980s I had a summer job as a research assistant to a history professor. One of my tasks was to create a series of graphs of the imports and exports for New Zealand over several decades, illustrating the effect of the UK joining the Common Market (now the EU). It required fastidious drawing and considerable time. (And correcting fluid.) These same graphs can now be created almost instantaneously, and the requirement has shifted to interpreting these graphs.

Similarly, in the classroom we should not be requiring students of any age to draw statistical graphs by hand. Drawing statistical graphs by hand is a waste of time. Students may enjoy creating the graphs by hand – I understand that – it is rewarding and not cognitively taxing. So is colouring in. The important skill that students need is to be able to read the graph – to find out what it is telling them and what it is not telling them. Their time would be far better spent looking at multiple graphs of different types, and learning how to report and critique them. They also need to be able to decide what graph will best show what they are looking for or communicating. (There will be teachers saying students need to draw graphs by hand to understand them. I’d like to know the evidence for this claim. People have said for years that students need to calculate standard deviation by hand to understand it, and I reject that also.)

At primary school level, the most useful graph is almost always the bar or column chart. These are easily created physically using data cards, or by entering category totals and using a spreadsheet. Here is a video showing just how easy it is.

Use a spreadsheet for statistical calculations

Spreadsheets are also very capable of calculating summary statistics and creating hypothesis tests and confidence intervals. Dedicated statistical packages are better, but spreadsheets are generally good enough. I would also teach pivot-tables as soon as possible, but that is a topic for another day.

Using spreadsheets to teach mathematics

Spreadsheets are so versatile! Spreadsheets help students to understand the concept of a variable. When you write a formula in a cell, you are creating an algebraic formula. Spreadsheets illustrate the need for sensible rounding and numeric display. Use of order of operations and brackets is essential. They can be used for exploring patterns and developing number sense. I have taught algebraic graphing, compared with line fitting using spreadsheets. Spreadsheets can solve algebraic problems. Spreadsheets make clear the concept of mathematics as a model. Combinatorics and Graph Theory are also enabled through spreadsheets. For users using a screenreader, the linear nature of formulas in spreadsheets makes it easier to read.

Using spreadsheets to teach computational thinking

In New Zealand we are rolling out a new curriculum for information technology, including computational thinking. At primary school level, computational thinking includes “[students] develop and debug simple programs that use inputs, outputs, sequence and iteration.” (Progress outcome 3, which is signposted to be reached at about Year 7.) Later the curriculum includes branching.

In most cases the materials include unplugged activities, and coding using languages such as Scratch or JavaScript. Robots such as Sphero and Lego make it all rather exciting.

All of these ideas can also be taught using a spreadsheet. Good spreadsheet design has clear inputs and outputs. The operations need to be performed in sequence, and iteration occurs when we have multiple rows in a spreadsheet. Spreadsheets need to be correct, robust and easy to use and modify. These are all important principles in coding. Unfortunately too many people have never had the background in coding and program design and thus their spreadsheets are messy, fragile, oblique and error-prone.
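As a rough illustration of how these ideas line up, here is a minimal Python sketch of a small spreadsheet. The prices, quantities and cell references in the comments are hypothetical and chosen only for the example. The inputs sit together in one place, one formula is "filled down" over every row (iteration), and a single summary value is the output.

```python
# Inputs: kept together and clearly labelled, like an input area on a sheet.
unit_price = 2.50            # hypothetical price per item (imagine cell E1)
quantities = [4, 10, 0, 7]   # hypothetical orders, one per row (cells B2:B5)


def row_total(quantity, price):
    """One row's formula, comparable to =B2*$E$1 filled down a column."""
    return quantity * price


# Sequence and iteration: the same formula is applied to each row in turn.
totals = [row_total(q, unit_price) for q in quantities]

# Output: a single summary cell, comparable to =SUM(C2:C5).
grand_total = sum(totals)

print(totals)
print(grand_total)
```

Changing `unit_price` in one place updates every row, which is the same design habit as referring to an input cell rather than typing the number into each formula.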

When we teach spreadsheets well to our students we are giving them a gift that will be useful for their life.

Experience teaching spreadsheets

I designed and taught a course in quantitative methods for business, heavily centred on spreadsheets. The students were required to use spreadsheets for mathematical and statistical tasks. Many students have since expressed their gratitude that they are capable of creating and using spreadsheets, a skill that has proved useful in employment.



Don’t teach significance testing – Guest post

The following is a guest post by Tony Hak of Rotterdam School of Management. I know Tony would love some discussion about it in the comments. I remain undecided either way, so would like to hear arguments.


It is now well understood that p-values are not informative and are not replicable. Soon null hypothesis significance testing (NHST) will be obsolete and will be replaced by the so-called “new” statistics (estimation and meta-analysis). This requires that undergraduate courses in statistics must already teach estimation and meta-analysis as the preferred way to present and analyze empirical results. If not, then the statistical skills of the graduates from these courses will be outdated on the day these graduates leave school. But it is less evident whether or not NHST (though not preferred as an analytic tool) should still be taught. Because estimation is already routinely taught as a preparation for the teaching of NHST, the necessary reform in teaching will not require the addition of new elements in current programs but rather the removal of the current emphasis on NHST or the complete removal of the teaching of NHST from the curriculum. The current trend is to continue the teaching of NHST. In my view, however, teaching of NHST should be discontinued immediately because it is (1) ineffective and (2) dangerous, and (3) it serves no aim.

1. Ineffective: NHST is difficult to understand and it is very hard to teach it successfully

We know that even good researchers often do not appreciate the fact that NHST outcomes are subject to sampling variation and believe that a “significant” result obtained in one study almost guarantees a significant result in a replication, even one with a smaller sample size. Is it then surprising that our students also do not understand what NHST outcomes do tell us and what they do not tell us? In fact, statistics teachers know that the principles and procedures of NHST are not well understood by undergraduate students who have successfully passed their courses on NHST. Courses on NHST fail to achieve their self-stated objectives, assuming that these objectives include achieving a correct understanding of the aims, assumptions, and procedures of NHST as well as a proper interpretation of its outcomes. It is very hard indeed to find a comment on NHST in any student paper (an essay, a thesis) that is close to a correct characterization of NHST or its outcomes. There are many reasons for this failure, but obviously the most important one is that NHST is a very complicated and counterintuitive procedure. It requires students and researchers to understand that a p-value is attached to an outcome (an estimate) based on its location in (or relative to) an imaginary distribution of sample outcomes around the null. Another reason, connected to their failure to understand what NHST is and does, is that students believe that NHST “corrects for chance” and hence they cannot cognitively accept that p-values themselves are subject to sampling variation (i.e. chance).
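The sampling variation of p-values is easy to see in a small simulation. The sketch below is illustrative only: it assumes a true effect of 0.5 standard deviations, samples of 20, and a two-sided z-test with known standard deviation, none of which come from the text above. It repeats the same "study" many times and reports how much the p-value moves around.

```python
import random
from statistics import NormalDist, mean


def z_test_p_value(sample, sigma=1.0):
    """Two-sided p-value for H0: population mean = 0, with known sigma."""
    z = mean(sample) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def replicate_p_values(true_mean=0.5, n=20, replications=1000, seed=1):
    """Run the same study many times and collect the p-value of each run."""
    rng = random.Random(seed)
    return [
        z_test_p_value([rng.gauss(true_mean, 1.0) for _ in range(n)])
        for _ in range(replications)
    ]


p_values = replicate_p_values()
share_significant = sum(p < 0.05 for p in p_values) / len(p_values)

print(f"smallest p = {min(p_values):.4f}, largest p = {max(p_values):.4f}")
print(f"share of replications with p < 0.05: {share_significant:.2f}")
```

Under these assumptions the test has power of roughly 60 percent, so even though the effect is real in every simulated study, a sizeable share of exact replications come out "non-significant".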

2. Dangerous: NHST thinking is addictive

One might argue that there is no harm in adding a p-value to an estimate in a research report and, hence, that there is no harm in teaching NHST, additionally to teaching estimation. However, the mixed experience with statistics reform in clinical and epidemiological research suggests that a more radical change is needed. Reports of clinical trials and of studies in clinical epidemiology now usually report estimates and confidence intervals, in addition to p-values. However, as Fidler et al. (2004) have shown, and contrary to what one would expect, authors continue to discuss their results in terms of significance. Fidler et al. therefore concluded that “editors can lead researchers to confidence intervals, but can’t make them think”. This suggests that a successful statistics reform requires a cognitive change that should be reflected in how results are interpreted in the Discussion sections of published reports.

The stickiness of dichotomous thinking can also be illustrated with the results of a more recent study of Coulson et al. (2010). They presented estimates and confidence intervals obtained in two studies to a group of researchers in psychology and medicine, and asked them to compare the results of the two studies and to interpret the difference between them. It appeared that a considerable proportion of these researchers, first, used the information about the confidence intervals to make a decision about the significance of the results (in one study) or the non-significance of the results (of the other study) and, then, drew the incorrect conclusion that the results of the two studies were in conflict. Note that no NHST information was provided and that participants were not asked in any way to “test” or to use dichotomous thinking. The results of this study suggest that NHST thinking can (and often will) be used by those who are familiar with it.
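A small numerical sketch shows why that conclusion is incorrect. The estimates and standard errors below are hypothetical, not the figures used by Coulson et al., and the studies are assumed to be independent with normally distributed estimates. One interval excludes zero and the other includes it, yet the interval for the difference between the two studies comfortably includes zero.

```python
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96


def ci95(estimate, standard_error):
    """Normal-approximation 95% confidence interval."""
    half_width = Z95 * standard_error
    return (estimate - half_width, estimate + half_width)


# Hypothetical (estimate, standard error) pairs for two independent studies.
study_a = (3.0, 1.2)
study_b = (2.0, 1.5)

ci_a = ci95(*study_a)   # excludes zero: would be called "significant"
ci_b = ci95(*study_b)   # includes zero: would be called "non-significant"

# Difference between the studies; variances add because they are independent.
difference = study_a[0] - study_b[0]
se_difference = (study_a[1] ** 2 + study_b[1] ** 2) ** 0.5
ci_difference = ci95(difference, se_difference)

print("Study A:", ci_a)
print("Study B:", ci_b)
print("A minus B:", ci_difference)
```

Read as estimates, the two studies tell much the same story; only the dichotomous "significant or not" reading makes them look contradictory.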

The fact that it appears to be very difficult for researchers to break the habit of thinking in terms of “testing” is, as with every addiction, a good reason to prevent future researchers from coming into contact with it in the first place and, if contact cannot be avoided, for providing them with robust resistance mechanisms. The implication for statistics teaching is that students should, first, learn estimation as the preferred way of presenting and analyzing research information and that they should be introduced to NHST, if at all, only after estimation has become their routine statistical practice.

3. It serves no aim: Relevant information can be found in research reports anyway

Our experience that teaching of NHST fails its own aims consistently (because NHST is too difficult to understand) and the fact that NHST appears to be dangerous and addictive are two good reasons to immediately stop teaching NHST. But there is a seemingly strong argument for continuing to introduce students to NHST, namely that a new generation of graduates will not be able to read the (past and current) academic literature in which authors themselves routinely focus on the statistical significance of their results. It is suggested that someone who does not know NHST cannot correctly interpret outcomes of NHST practices. This argument has no value for the simple reason that it is assumed in the argument that NHST outcomes are relevant and should be interpreted. But the reason that we have the current discussion about teaching is the fact that NHST outcomes are at best uninformative (beyond the information already provided by estimation) and are at worst misleading or plain wrong. The point is all along that nothing is lost by just ignoring the information that is related to NHST in a research report and by focusing only on the information that is provided about the observed effect size and its confidence interval.


Coulson, M., Healy, M., Fidler, F., & Cumming, G. (2010). Confidence Intervals Permit, But Do Not Guarantee, Better Inference than Statistical Significance Testing. Frontiers in Quantitative Psychology and Measurement, 20(1), 37-46.

Fidler, F., Thomason, N., Finch, S., & Leeman, J. (2004). Editors Can Lead Researchers to Confidence Intervals, But Can’t Make Them Think: Statistical Reform Lessons from Medicine. Psychological Science, 15(2), 119-126.

This text is a condensed version of the paper “After Statistics Reform: Should We Still Teach Significance Testing?” published in the Proceedings of ICOTS9.


Excel, SPSS, Minitab or R?

I often hear this question: Should I use Excel to teach my class? Or should I use R? Which package is the best?

Update in April 2018: I have written a further post, covering other aspects and other packages.

It depends on the class

The short answer is: It depends on your class. You have to ask yourself, what are the attitudes, skills and knowledge that you wish the students to gain in the course. What is it that you want them to feel and do and understand?

If the students are never likely to do any more statistics, what matters most is that they understand the elementary ideas, feel happy about what they have done, and recognise the power of statistical analysis, so they can later employ a statistician.

If the students are strong in programming, such as engineering or computer science students, then they are less likely to find the programming a barrier, and will want to explore the versatility of the package.

If they are research students and need to take the course as part of a research methods paper, then they should be taught on the package they are most likely to use in their research.

Over the years I have taught statistics using Excel, Minitab and SPSS. These days I am preparing materials for courses using iNZight, which is a specifically designed user interface with an R engine. I have dabbled in R, but never had students who are suitable to be taught using R.

Here are my pros and cons for each of these, and when each is most suitable.


I have already written somewhat about the good and bad aspects of Excel, and the evils of Excel histograms. There are many problems with statistical analysis with Excel. I am told there are parts of the analysis toolpak which are wrong, though I’ve never found them myself. There is no straightforward way to do a hypothesis test for a mean. The data-handling capabilities of the spreadsheet are fantastic, but the toolpak cannot even deal well with missing values. The output is idiosyncratic, and not at all intuitive. There are programming quirks which should have been eliminated many years ago. For example, when you click on a radio button to say where you wish the output to go, the entry box for the data is activated, rather than the one for the output. Correcting this requires only elementary Visual Basic, but it has never happened. Each time Excel upgrades I look for this small fix, and have repeatedly been disappointed.

So, given these shortcomings, why would you use Excel? Because it is there, because you are helping students gain other skills in spreadsheeting at the same time, because it is less daunting to use a familiar interface. These reasons may not apply to all students. Excel is the best package for first year business students for so many reasons.

PivotTables in Excel are nasty to get your head around, but once you do, they are fantastic. I resisted teaching PivotTables for some years, but I was wrong. They may well be one of the most useful things I have ever taught at university. I made my students create comparative bar charts in Excel, using PivotTables. One day Helen and I will make a video about PivotTables.


Minitab is a lovely little package, and has very nice output. Its roots as a teaching package are obvious from the user-friendly presentation of results. It has been some years since I taught with Minitab. The main reason for this is that the students are unlikely ever to have access to Minitab again, and there is a lot of extra learning required in order to make it run.


Most of my teaching at second year undergraduate and MBA and Masters of Education level has been with SPSS. Much of the analysis for my PhD research was done on SPSS. It’s a useful package, with its own peculiarities. I really like the data-handling in terms of excluding data, transforming variables and dealing with missing values. It has a much larger suite of analysis tools, including factor analysis, discriminant analysis, clustering and multi-dimensional scaling, which I taught to second year business students and research students. SPSS shows its origins as a suite of barely related packages, in the way it does things differently between different areas. But it’s pretty good really.


R is what you expect from a command-line open-source program. It is extremely versatile, and pretty daunting for an arts or business major. I can see that R is brilliant for second-level and up in statistics, preferably for students who have already mastered similar packages/languages like MATLAB or Maple. It is probably also a good introduction to high-level programming for Operations Research students.


This brings us to iNZight, which is a suite of routines using R, set in a semi-friendly user interface. It was specifically written to support the innovative New Zealand school curriculum in statistics, and has a strong emphasis on visual representation of data and results. It includes alternatives that use bootstrapping as well as traditional hypothesis testing. The time series package allows only one kind of seasonal model. I like iNZight. If I were teaching at university still, I would think very hard about using it. I certainly would use it for Time Series analysis at first year level. For high school teachers in New Zealand, there is nothing to beat it.

It has some issues. The interface is clunky and takes a long time to unzip if you have a dodgy computer (as I do). The graphics are unattractive. Sorry guys, I HATE the eyeball, and the colours don’t do it for me either. I think they need to employ a professional designer. SOON! The data has to be just right before the interface will accept it. It is a little bit buggy in a non-disastrous sort of way. It can have dimensionality/rounding issues. (I got a zero slope coefficient for a linear regression with an r of 0.07 the other day.)

But – iNZight does exactly what you want it to do, with lots of great graphics and routines to help with understanding. It is FREE. It isn’t crowded with all the extras that you don’t really need. It covers all of the New Zealand statistics curriculum, so the students need only to learn one interface.

There are other packages such as Genstat, Fathom and TinkerPlots, aimed at different purposes. My university did not have any of these, so I didn’t learn them. They may well be fantastic, but I haven’t the time to do a critique just now. Feel free to add one as a comment below!

Organising the toolbox in statistics and operations research

Don’t bury students in tools

In our statistics courses and textbooks there is a tendency to hand our students tool after tool, wanting to teach them all they need to know. However, students can feel buried under these tools and unable to decide which to use for which task. This is also true in beginning Operations Research or Management Science courses. To the instructors, it is obvious whether to use the test for paired or independent samples or whether to use multicriteria decision making or a decision tree. But it is just another source of confusion for the student, who wants to be told what to do.

Tools for statistics and operations research

A common approach to teaching hypothesis testing in business statistics courses, if textbooks are anything to go by, is to teach several different forms of hypothesis testing, starting with the test for a mean and the test for a proportion, then the difference of two means (independent and paired), then the difference of two proportions. Then we have tests for regression and correlation, and the chi-squared test for independence. These are the seven basic statistical tests that people are likely to use or see. I would probably add ANOVA, if there is enough time. Even listed, this seems a bit confusing.

An introductory operations research course might include any number of topics including linear programming, integer programming, inventory control, queueing, simulation, decision analysis, critical path, the assignment problem, dynamic programming, systems analysis, financial modelling… And I would hope some overall teaching about models and the OR process.

Issues with the pile of tools

Of course we need to teach the essential tools of our discipline, but there are two issues arising from this approach.

The obvious one is that students are left bewildered as to which test they should use when. Because of the way textbooks and courses are organised, students don’t usually have to decide which tool to use in a given situation. If the preceding chapter is about linear programming, then the questions will be about linear programming.

The second issue is that unless students are helped, they fail to see the connections between the techniques and are left with a fragmented view of the discipline. It is not just a question of which tool to use for which task, it is about seeing the linkages and the similarities. We want to help them have integrated knowledge.

Providing activities to help with organisation

In both my introductory courses I attempted to address this, with varying degrees of success.

In our management science course we ended the year with a case of a situation with multiple needs, and the students were to identify which technique would be useful in each instance. Then the final exam had a similar question, with specific questions about over-arching concepts such as deterministic and stochastic inputs, and the purpose of the model – to optimise or inform. This was also an opportunity to address issues of ethics and worldview.

In the final section of the business statistics course we had a large bank of questions for students to work through, to give them practice in deciding which test to use. I was careful to make sure that there was more than one question related to each scenario, so that students would not learn unhelpful shortcuts, such as, if the question is about weight loss, the answer must be paired difference of two means. I also analysed the mistakes given in multichoice answers, to see where confusion was arising, sometimes due to poor wording. From this I refined the questions.

Examples of the questions for test choice in hypothesis testing

Management thinks there is a difference in productivity between the two days of the week in a certain work area. The production output of a random sample of 15 factory workers is recorded on both a Tuesday and a Friday of the same week. For each worker, the number of completed garments is counted on both days.

A restaurant manager is thinking of doing a special “girls’ night out” promotion. She suspects that groups of women together are more likely to stay for dessert than mixed adult or family groups. For the next two weeks she gets the staff to write down for each table whether they stay for dessert, and what type of group they are. She asks you to see if her suspicion is correct.

A human resources department has data on 200 past employees, including how long, in months, they stayed at the company, and the mark out of 100 they got in their recruitment assessment. They ask you to work out whether you can predict how long a person will stay, based on their test mark.

A researcher wanted to investigate whether a new diet was effective in helping with weight loss. She got 40 volunteers and got 20 to use the diet and the other 20 to eat normally. After 6 weeks the weights (in kg) before and after were recorded for each volunteer, and the difference calculated. She then looked at how the weight losses differed between the two groups.

Comment on the questions

You might notice that all the examples are in a business context. This is because this is a course for business students, and they need to know that what they are learning is relevant to their future. Questions about dolphins and pine trees are not suitable for these students. (Unless we are making money out of them!)

The master diagram

The students worked through these multiple choice questions on-line, and we offered help and coached them through questions with which they had difficulty. By taking my turn with the teaching assistants in the computer labs, I was able to understand better how the students perceived the tests, and ways to help them with this. The result is a diagram, or set of diagrams which shows the relationships between the tests, and a procedure to help them make the decision. I am a great believer in diagrams, but they need to be well thought out. Many textbooks have branching diagrams, showing a decision process for which test to use. I felt there was a more holistic way to approach it, and thought long and hard, and tried out my diagrams on students before I came up with our different approach. You can see the diagrams here by clicking on the link to the pdf which you can download: Choosing the test diagrams

The three questions which help the students to identify the most appropriate test are:

  1. What level of measurement is the data – Nominal or interval/ratio?
  2. How many samples do we have?
  3. What is the purpose of our analysis?
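One way to picture how three answers narrow down the choice is a simple lookup table. The sketch below is a hypothetical encoding, not a transcription of the diagrams: the category labels, the `choose_test` function and the exact mapping are assumptions made for illustration, built around the seven tests listed earlier.

```python
# Hypothetical encoding of the three questions. The labels and the mapping
# are illustrative assumptions, not a copy of the author's diagrams.
# Key: (level of measurement, samples, purpose of the analysis)
TEST_CHOICES = {
    ("interval/ratio", "one", "compare with a value"):
        "test for a mean",
    ("nominal", "one", "compare with a value"):
        "test for a proportion",
    ("interval/ratio", "one, measured twice", "compare two conditions"):
        "paired difference of two means",
    ("interval/ratio", "two", "compare two groups"):
        "difference of two means (independent)",
    ("nominal", "two", "compare two groups"):
        "difference of two proportions",
    ("interval/ratio", "one", "relationship between variables"):
        "regression and correlation",
    ("nominal", "one", "relationship between variables"):
        "chi-squared test for independence",
}


def choose_test(level, samples, purpose):
    """Look up a test from the answers to the three questions."""
    key = (level, samples, purpose)
    return TEST_CHOICES.get(key, "no match: revisit the three questions")


# The factory example: garments counted for the same workers on two days.
print(choose_test("interval/ratio", "one, measured twice", "compare two conditions"))
# The human resources example: predicting length of stay from a test mark.
print(choose_test("interval/ratio", "one", "relationship between variables"))
```

A table like this makes the point of the three questions visible: each test sits at one combination of answers, and neighbouring tests differ in only one of them.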

I made an on-line lesson which takes the students through the steps over and over, and created the diagrams to help them. Time and again the students said how much it helped them to fit it all together. Eventually I made the following video, which is on YouTube. I suspect it must be coming up to summary time in courses in the US, as this video has recently attracted a lot of views, and positive comments.

The video is also part of our app, AtMyPace: Statistics, along with two sets of questions to help students to learn about the different types of tests and how to tell them apart. You can access the same resources on-line through AtMyPace: Statistics at statsLC.com.

It is important to see the subject as a whole, and not a jumbled mass of techniques and ideas, and this has really helped my students and many others through the video and app.

Best wishes for the holiday season

It is Christmas time and here in Christchurch the sun is shining and barbecues and beaches are calling. I am taking a break from the blog for the great New Zealand shut-down and will be back in the New Year.

Thank you to all the followers, and especially for your comments, Likes and ReTweets.

Judgment Calls in Statistics and O.R.

The one-armed operations researcher

My mentor, Hans Daellenbach, told me a story about a client asking for a one-armed Operations Researcher. The client was sick of getting answers that went, “On the one hand, the best decision would be to proceed, but on the other hand…”

People like the correct answer. They like certainty. They like to know they got it right.

I tease my husband that he has to find the best picnic spot or the best parking place, which involves us driving around considerably longer than I (or the children) were happy with. To be fair, we do end up in very nice picnic spots. However, several of the other places would have been just fine too!

In a different context I too am guilty of this – the reason I loved mathematics at school was because you knew whether you were right or wrong and could get a satisfying row of little red ticks (checkmarks) down the page. English and other arts subjects, I found too mushy as you could never get it perfect. Biology was annoying as plants were so variable, except in their ability to die. Chemistry was ok, so long as we stuck to the nice definite stuff like drawing organic molecules and balancing redox equations.

I think most mathematics teachers are mathematics teachers because they like things to be right or wrong. They like to be able to look at an answer and tell whether it is correct, or if it should get half marks for correct working. They do NOT want to mark essays, which are full of mushy judgements.

Again I am sympathetic. I once did a course in basketball refereeing. I enjoyed learning all the rules, and where to stand, and the hand signals etc, but I hated being a referee. All those decisions were just too much for me. I could never tell who had put the ball out, and was unhappy with guessing. I think I did referee two games at a church league and ended up with an angry player bashing me in the face with the ball. Looking back I think it didn’t help that I wasn’t much of a player either.

I also used to find marking exam papers very challenging, as I wanted to get it right every time. I would agonise over every mark, thinking it could be the difference between passing and failing for some poor student. However as the years went by, I realised that the odd mistake or inconsistency here or there was just usual, and within the range of error. To someone who failed by one mark, my suggestion is not to be borderline. I’m pretty sure we passed more people that we shouldn’t have, than the other way around.

Life is not deterministic

The point is, that life in general is not deterministic and certain and rule-based. This is where the great divide lies between the subject of mathematics and the practice of statistics. Generally in mathematics you can find an answer and even check that it is correct. Or you can show that there is no answer (as happened in one of our national exams in 2012!). But often in statistics there is no clear answer. Sometimes it even depends on the context. This does not sit well with some mathematics teachers.

In operations research there is an interesting tension between optimisers and people who use heuristics. Optimisers love to say that they have the optimal solution to the problem. The non-optimisers like to point out that the problem solved optimally, is so far removed from the actual problem, that all it provides is an upper or lower bound to a practical solution to the actual real-life problem situation.

Judgment calls occur all through the mathematical decision sciences. They include

  • What method to use – Linear programming or heuristic search?
  • Approximations – How do we model a stochastic input in a deterministic model?
  • Assumptions – Is it reasonable to assume that the observations are independent?
  • P-value cutoff – Does a p-value of exactly 0.05 constitute evidence against the null hypothesis?
  • Sample size – Is it reasonable to draw any inferences at all from a sample of 6?
  • Grouping – How do we group by age? by income?
  • Data cleaning – Do we remove the outlier or leave it in?

A comment from a maths teacher on my post regarding the Central Limit Theorem included the following: “The questions that continue to irk me are i) how do you know when to make the call? ii) What are the errors involved in making such a call? I suppose that Hypothesis testing along with p-values took care of such issues and offered some form of security in accepting or rejecting such a hypothesis. I am just a little worried that objectivity is being lost, with personal interpretation being the prevailing arbiter which seems inadequate.”

These are very real concerns, and reflect the mathematical desire for correctness and security. But I propose that the security was an illusion in the first place. There has always been personal interpretation. Informal inference is a nice introduction to help us understand that. And in fact it would be a good opportunity for lively discussion in a statistics class.

With bootstrapping methods we don’t have any less information than we did using the Central Limit Theorem. We just haven’t assumed normality or independence. There was no security. There was the idea that with a 95% confidence interval, for example, we are 95% sure that the interval contains the true population value. I wonder how often we realised that 1 in 20 times we were just plain wrong, and in quite a few instances the population parameter would be far from the centre of the interval.
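The “1 in 20” point can be checked by simulation. What follows is a minimal sketch and not something from the original post: it assumes a normally distributed population with an invented mean and standard deviation, draws many samples, and counts how often the usual 95% interval for the mean contains the true value. The function name and every number in it are illustrative assumptions.

```python
import random
import statistics


def coverage_of_95_interval(true_mean=50.0, true_sd=10.0, n=40,
                            trials=2000, seed=1):
    """Return the fraction of 95% intervals that contain the true mean.

    All parameter values are made up for illustration.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        mean = statistics.mean(sample)
        # Standard error of the mean, estimated from the sample itself.
        se = statistics.stdev(sample) / n ** 0.5
        # 1.96 is the normal-theory multiplier for a 95% interval.
        lower, upper = mean - 1.96 * se, mean + 1.96 * se
        if lower <= true_mean <= upper:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    covered = coverage_of_95_interval()
    print(f"Intervals containing the true mean: {covered:.1%}")
    print(f"Intervals that missed it: {1 - covered:.1%}")
```

Running it with different seeds or sample sizes gives students a feel for how often a perfectly respectable interval simply misses.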

The hopeful thing about teaching statistics via bootstrapping is that by demystifying it we may be able to inject some more healthy scepticism into the populace.

What Mathematics teachers need to know about statistics

My post suggesting that statistics is more vital for efficient citizens than algebra has led to some interesting discussions on Twitter and elsewhere. Currently I am beginning an exciting venture to provide support materials for teachers and students of statistics, starting with New Zealand. These two circumstances have led me to ponder why maths teachers think that statistics is a subset of mathematics, and what knowledge and attitudes will help them make the transition to teaching statistics as a subject.

An earlier post called for mathematics to leave statistics alone. This post builds on that by providing some ways of thinking that might be helpful to mathematics teachers who have no choice but to teach statistics.

Statistics is not a subset of mathematics

Let me quote a forum post from a teacher of mathematics in New Zealand:

  • “It seems strange to me that Statistics is a small part of Mathematics (which also includes Trigonometry, Algebra, Geometry, Calculus … but for our year 13s and now our year 12s, it’s attained equal parity as one area with all the other branches of maths put together as another area.”

This is very helpful as it lets us see where the writer is coming from. To him, statistics is a subset of mathematics – a small part, and somehow it has managed to push its way onto an equal footing with “mathematics.”

I disagree.

I think we need to take a look at the role of compulsory schooling. It is popular among people who go to university, and even more so among those who never leave (having become academics themselves), to think that the main, if not the only, role of school is to prepare students for university. If the students somehow have not gained the skills and knowledge that the university lecturers believe are necessary, then the schools have failed – or worse still, the system has failed. Again I disagree.

The vision for the young people of New Zealand is stated in the official curriculum.

“Our vision is for young people:

  • who will be creative, energetic, and enterprising
  • who will seize the opportunities offered by new knowledge and technologies to secure a sustainable social, cultural, economic, and environmental future for our country
  • who will work to create an Aotearoa New Zealand in which Māori and Pākehā recognise each other as full Treaty partners, and in which all cultures are valued for the contributions they bring
  • who, in their school years, will continue to develop the values, knowledge, and competencies that will enable them to live full and satisfying lives
  • who will be confident, connected, actively involved, and lifelong learners.”

It doesn’t actually mention preparing people for university.

My view is that school is about preparing young people for life, while helping them to enjoy the journey.

What teachers need to know about statistics

Statistics for life can be summed up as C, D, E, standing for Chance, Data and Evidence.


Chance

Students need to understand about the variability in their world. Probability is a mathematical way of modelling the inherent uncertainty around us. The mathematical part of probability includes combinatorics and the ability to manipulate tables. You can use Venn diagrams and trees if you like, and tables can be really useful too. The most difficult part for many students is converting the ideas into mathematical terminology and making sense of that.

Bear in mind that perceptions of chance have cultural implications. Some cultures play board games and other games of chance from a young age, and gain an inherent understanding of the nature of uncertainty as provided by dice. However there are other cultures for whom all things are decided by God, and nothing is by chance. There are many philosophical discussions which can be had regarding the nature of uncertainty and variability. The work of Tversky and Kahneman and others has alerted us to the misconceptions we all have about chance.

An area where the understanding of probabilities and relative risk is vital is that of medical screening. Studies among medical practitioners have shown that many of them cannot correctly estimate the probability of a false positive, or the probability of a true positive, given that the result of a test is positive. This is easily conveyed through contingency tables, which are now part of the NZ curriculum.
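To make the contingency-table idea concrete, here is a minimal sketch. The population size, prevalence, sensitivity and false positive rate are invented for illustration and do not describe any real screening programme; the function names are hypothetical too.

```python
def screening_table(population, prevalence, sensitivity, false_positive_rate):
    """Build a 2x2 contingency table of expected counts for a screening test."""
    diseased = population * prevalence
    healthy = population - diseased
    true_pos = diseased * sensitivity          # have the condition, test positive
    false_neg = diseased - true_pos            # have the condition, test negative
    false_pos = healthy * false_positive_rate  # healthy, but test positive
    true_neg = healthy - false_pos             # healthy, test negative
    return {
        "true_pos": true_pos, "false_neg": false_neg,
        "false_pos": false_pos, "true_neg": true_neg,
    }


def prob_condition_given_positive(table):
    """Of everyone who tests positive, what fraction has the condition?"""
    positives = table["true_pos"] + table["false_pos"]
    return table["true_pos"] / positives


# Hypothetical figures: 10,000 people, 1% prevalence,
# 90% sensitivity, 9% false positive rate.
table = screening_table(10_000, 0.01, 0.90, 0.09)
for cell, count in table.items():
    print(f"{cell:>10}: {count:8.0f}")
print(f"P(condition | positive) = {prob_condition_given_positive(table):.3f}")
```

With these invented figures the table has 90 true positives against 891 false positives, so fewer than one in ten positive results belongs to someone who actually has the condition. Reading that off a table of counts is far easier than manipulating conditional probability formulas.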


Data

When people talk about “statistics”, more often they are talking about data and information than the discipline of statistical analysis. Just about everyone is interested in some area of statistics. Note the obsession of the media for reporting the road toll and comparing with previous years or holiday periods. Sports statistics occupy many people’s thoughts, and can fill in the (long) gaps between the action in a cricket commentary. Weather statistics are vital to farmers, planners, environmentalists. Hospitals are now required to report various statistics. The web is full of statistics. It is difficult to think of an area of life which does not use statistics. The second thing we want to know about a new-born baby is its weight.

Just because data contains numbers does not make it mathematics. There are arithmetic skills, such as adding and dividing, which can be practised using data. But that’s about it when it comes to mathematics and data. These days we have computer packages which can calculate all sorts of summary values, and create graphs for better or worse, so the need for mathematical or numeracy skills is much diminished. What is needed is the ability to communicate ideas using numbers and diagrams; by communication I mean production and interpretation of reports and diagrams.

The area of data also includes the collection of data. This is taught at all levels of the NZ curriculum. Students are taught to think about measurement, both physical and through questionnaires. Eventually students learn to design experiments to explore new ideas. Some might see this as science or biology, social studies or psychology, technology and business. There are even applications in music where students explore people’s music preferences. Data occurs in all subjects, and really the skills of data analysis should be taught in context. But until the current generation of students become the teachers, we may need to rely on the teachers of statistics to provide support. There are wonderful opportunities for collaboration between disciplines, if our compartmentalised school system would allow them.


Evidence

Much data is population data and conclusions can easily be drawn from it. However we also use samples to draw conclusions about populations. Inferential statistics has been developed using theoretical probability distributions to help us use samples to draw conclusions about populations. Unfortunately the most popular form of inference, hypothesis testing, is counter-intuitive at best. Many teachers do not truly understand the application of inferential statistics – and why should they – they may never have performed a real statistical analysis. It is only through repeated application of techniques to multiple contexts that most people can start to feel comfortable and get some understanding of what is happening. The beauty is that today the technology makes it possible for students to perform multiple analyses so that they can learn to discern the general from the specific.

The New Zealand school system has taken the courageous* step to introduce the use of resampling, also known as bootstrapping or randomisation, for the generation of confidence intervals. This is contentious and is causing teachers concern. I will dedicate a whole post to the ideas of resampling and why they may be preferable to more traditional approaches. I empathise with the teachers who are feeling out of their depth, and hope that our materials, along with the excellent ones provided by “Census at School” can be of help.
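For readers who have not met resampling, here is a minimal sketch of a percentile bootstrap interval. It is one common variant and not necessarily the exact procedure used in the New Zealand curriculum materials. The sample values, the function name and the defaults are all invented for illustration.

```python
import random
import statistics


def bootstrap_interval(data, statistic=statistics.median,
                       resamples=5000, level=0.95, seed=0):
    """Percentile bootstrap interval for `statistic` of `data`.

    Repeatedly resample the data with replacement, recompute the
    statistic each time, and take the middle `level` share of the results.
    """
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(resamples):
        resample = [rng.choice(data) for _ in range(n)]
        estimates.append(statistic(resample))
    estimates.sort()
    tail = (1 - level) / 2
    lower = estimates[round(tail * resamples)]
    upper = estimates[round((1 - tail) * resamples) - 1]
    return lower, upper


# Hypothetical sample: weekly pocket money (dollars) for 12 students.
sample = [5, 8, 10, 10, 12, 15, 15, 18, 20, 25, 30, 50]
low, high = bootstrap_interval(sample)
print(f"Sample median: {statistics.median(sample)}")
print(f"Bootstrap 95% interval for the median: {low} to {high}")
```

Nothing here relies on a theoretical distribution. The only raw material is the sample itself, which is part of why the approach can be easier to explain than the formula-based interval.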

I have no doubt that educators all over the world are watching to see how this goes before attempting similar moves in their own countries. Yet again New Zealand gets to lead the world. Watch this space!

*In the popular British television show, “Yes Minister”, the public servant, Sir Humphrey, would use the term “courageous” to describe a proposal which was probably right, but also likely to lose votes.

Statistics and chocolate

Some time ago I promised to blog about how to teach statistics with chocolate. Anyone who has watched my YouTube videos may have noticed a recurring theme. Helen sells Choconutties. This is a fictitious chocolate bar, originally devised to require a table of prices, which would require the use of absolute and relative references. As the series developed Helen had all sorts of issues with her sales, which required statistical analysis, often using spreadsheets. For the “choosing the test” video I managed to come up with seven different chocolate-based scenarios. I’ve had complaints from my students that using my materials makes them hungry.

One of my favourite lectures, back when I gave lectures, involved the use of chocolate in teaching about hypothesis testing. Here’s how it goes.

Pinkie, Mars and Crunchie bars are shown to the class

I come into class with a large opaque bag of small chocolate bars. I tell the class: “In my bag I have equal numbers of three types of bar – Pinkies, Crunchies and Mars bars.” I show them an example of each of the bars, placing them on the document camera base. Next I say, “I’m going to draw one bar at random, and if you can guess correctly what it is, you can keep it.” I call for volunteers, and even our normally reticent New Zealand students will join in, if chocolate is involved.

It’s a Pinkie!

So the first one guesses: “Mars”.

I draw out a bar – it is a Pinkie. “Sorry, you don’t win a bar.”

It’s another Pinkie!

Another student tries: “Crunchie”

I draw out another bar – it is a Pinkie. “Sorry, no bar for you.”

Pinkie number 3.

Third student: “Pinkie”

I draw out a bar – and yes it is a Pinkie. The students laugh and I toss the Pinkie to the student who guessed correctly.

Fourth student: “Mars”

I draw out a bar – another Pinkie. There are murmurings in the ranks.

Fifth student: “Pinkie” – wins a bar.

Sixth student: “Pinkie” – you guessed it – they win a Pinkie too.

It has become apparent that all is not as it ought to be. I up-end the bag on the front desk and the shiny pink bars pour out. I was not entirely truthful when I told them what was in the bag, as it contained only Pinkies.

All is revealed

So then I ask them – when did you get suspicious that the bag was not one-third Pinkies?

The answer is usually that they get suspicious at three in a row, but their suspicions are confirmed at four Pinkies in a row. So then we look at the probabilities of those occurrences. If the bag really did contain one-third Pinkies, then the probability of getting three Pinkies in a row is 1/3 to the power of 3, or 1/27, or about 0.037. However the probability of four Pinkies in a row is 1/81, just over 1%. We doubt the initial assumptions when an event of low probability occurs. It was possible that the bag contained what I said it did, but the evidence was against it.
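The arithmetic can be laid out for each successive draw. This is a minimal sketch that assumes the draws are independent and the proportion of Pinkies stays at one-third throughout, which is only approximately true when bars are taken from a large bag without being put back. The function name is illustrative.

```python
from fractions import Fraction


def prob_all_pinkies(draws, proportion=Fraction(1, 3)):
    """Probability that every one of `draws` bars is a Pinkie,
    if the claimed proportion of Pinkies is true."""
    return proportion ** draws


for draws in range(1, 7):
    p = prob_all_pinkies(draws)
    print(f"{draws} Pinkies in a row: {p} = {float(p):.4f}")
```

Seeing the probability shrink draw by draw mirrors what happens in the room, where the murmurings start well before anyone has done a calculation.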

From here we move on to the null hypothesis. In this case the null hypothesis is that the proportion of Pinkies is one-third. We took a sample of bars, all of which were Pinkies, and found that the likelihood of getting that sample, if the null hypothesis is true, is very small. So then we need to revise our view of the world – or in this case the contents of the bag and the veracity of the lecturer.

I have carried on with a second exercise, where the bag contains no Pinkies, but Mars and Crunchies. Obviously it takes longer to suspect that there are no Pinkies. However, the students are more suspicious because the previous experience has taught them not to trust the instructor. My colleague used the same bars in a lecture a few weeks later, with many of the same students, and found they were suspicious of him also.

I have found this to be a worthwhile exercise. It is delightful to watch the dawning suspicion and amusement when they find that all is not as it seems. Then as we work through hypothesis testing and other inference I come back time and again to the idea of evidence. How much evidence did we need before we decided that the bag was not one-third Pinkies? We didn’t need to use chocolate, but it certainly makes the lesson memorable.

A picture is worth…

I don’t believe in learning styles. The idea of visual, auditory, tactile and kinaesthetic learners has been popular in the last decade, but has done a great disservice to many learners labelled “kinaesthetic” and left to play with the blocks in the corner. So when a person tells me they are a visual learner, in the same defining tone they would use to state their height or eye color, I wince inside.

What I do believe is that effective learners use many different ways to learn, and for most of us a well-thought-out diagram will help in understanding and retaining new knowledge.

I love diagrams, but call me a visual learner and I will be tempted to slap you. I also like to have things explained out loud, and I like to try things for myself, and I really like to make and eat cookies. None of those things define me.

But I do love diagrams.

Explaining the p-value with diagrams

Diagrams should either be correct and understandable, or a useful metaphor. Do a Google search for a picture to represent a p-value and you will get a screen full of bell curves with little shady bits in one or both tails. For the 99% of the population learning about p-values, this is USELESS! They do not understand what the curve represents, let alone the shady bit in the corner or its relationship to the p-value. So give it up.

Results of a Google search for images of the p-value

In making a video about the p-value, I wanted some useful pictures that would help people to understand and remember how to use a p-value. Videos without pictures are rather missing the point. Most students aren’t ready yet to know where the p-value comes from, so my aim is for them to use it correctly.

The personification of hypotheses and the p-value

So I came up with a metaphor involving little people representing the Null and Alternative hypotheses and a big (or small) yellow letter P. This develops a memorable mental representation when the poor null hypothesis keeps getting rejected. You can see this in the video, Understanding the p-value.

Diagrams for teaching percentages

As part of the Quantitative Methods for Business course we teach percentages. These are surprisingly tricky. We include questions like “A lawyer’s bill came to $4167.60 including GST (Goods and Services Tax) of 15%. How much was the GST component?”

After several years of teaching and helping students, I decided we needed a diagram. After considerable thinking I produced this.

Percentage calculations

Students love this!

Students find the diagram really useful for dealing with just about any percentage question. I teach them how to construct the diagram for answering questions about increasing and decreasing by a percentage. It is gratifying to see them drawing their own diagrams to answer questions.
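The lawyer’s bill question can also be checked with a few lines of code. This is a minimal sketch of the arithmetic only and does not reproduce the diagram. The function name is hypothetical. The key step is that the GST-inclusive total is 115% of the pre-tax amount, so the GST is not simply 15% of the total.

```python
def gst_component(total_including_gst, rate=0.15):
    """Split a GST-inclusive total into its pre-tax amount and the GST."""
    # The total is (1 + rate) times the pre-tax amount, so divide to go back.
    pre_tax = total_including_gst / (1 + rate)
    gst = total_including_gst - pre_tax
    return round(pre_tax, 2), round(gst, 2)


pre_tax, gst = gst_component(4167.60)
print(f"Amount before GST: ${pre_tax:.2f}")
print(f"GST component:     ${gst:.2f}")
```

The common slip is to take 15% of $4167.60, which gives $625.14 and overstates the tax.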

Visual images in teaching Linear Programming

Most teachers of LP like to start with two decision variables to teach Linear Programming. This way it can be drawn on a Cartesian plane. This is great for maths and engineering students who are at home on the Cartesian plane. It is really good for understanding shadow prices and binding and slack constraints. However for business students, we found this added an extra level of incomprehension.

A consistent, logical LP layout helps learning

For this reason we go straight to the spreadsheet model, and standardise the look of an LP. This helps them to gain a visual image of an LP, with objective function, decision variables, and constraint values in different colours. It is worth thinking about whether the Cartesian plane with a trivial example is right for your students.
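The same three-part layout can be mirrored outside a spreadsheet. The sketch below is only an illustration of the structure: the products, profits and resource limits are invented, and it evaluates one candidate set of decision values without searching for the optimum the way a spreadsheet solver would.

```python
# Hypothetical two-product example; every number below is invented.
decision_names = ["Plain bars", "Nutty bars"]
decision_values = [30, 20]          # the cells a solver would change
profit_per_unit = [2.0, 3.0]        # objective function coefficients

# Each constraint row: (label, coefficients, right-hand-side limit).
constraints = [
    ("Chocolate (kg)", [1.0, 1.0], 60.0),
    ("Nuts (kg)",      [0.0, 0.5], 10.0),
]


def sumproduct(xs, ys):
    """Same idea as the spreadsheet SUMPRODUCT function."""
    return sum(x * y for x, y in zip(xs, ys))


def evaluate(values):
    """Return total profit and, per constraint, (label, used, limit, slack)."""
    profit = sumproduct(profit_per_unit, values)
    rows = []
    for label, coeffs, limit in constraints:
        used = sumproduct(coeffs, values)
        rows.append((label, used, limit, limit - used))
    return profit, rows


profit, rows = evaluate(decision_values)
print(f"Objective (total profit): {profit}")
for label, used, limit, slack in rows:
    status = "binding" if slack == 0 else "slack" if slack > 0 else "violated"
    print(f"{label}: used {used} of {limit} ({status})")
```

Keeping the objective, the decision values and the constraint rows visibly separate is the point, whatever tool is used to hold them.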


Like most of the claims about learning styles, the ideas expressed in this post are not based on empirical research. However, they are the result of over twenty years of teaching and reflection, and this is a blog, not a peer-reviewed journal article. 😉

Drill and Rote in teaching LP and Hypothesis Testing

Drill and rote-learning are derogatory terms in many education settings. They have the musty taint of “old-fashioned” ways of teaching. They evoke images of wooden classrooms and tight-lipped spinsters dressed in grey looming over trembling pupils as they recite their times-tables. Drill and rote-learning imply mindless repetition, devoid of understanding.

Much more attractive educational terms are “discovery”, “exploration”, “engagement”. Constructivism requires that learners engage with their materials and create learning by building on existing knowledge and experiences.

But (and I’m sure you could see this coming) I think there is a place for something not far from drill or rote-learning when teaching statistics and operations research. However I like to call it “well-designed repetitive practice”, rather than drill or rote-learning. With another name it smells a little sweeter.

Students need repeated exposure to and exploration of spreadsheet Linear Programming models in order to generalize and construct their own understanding correctly. Students benefit from repeated exposure to hypothesis testing in different contexts in order to discern the general from the specific. But this is not “mindless repetition” of similar examples where wrong generalizations can (and will) be constructed. The different examples should be carefully managed to make effective use of students’ time, and avoid reinforcement of incorrect concepts.

Reason for well-designed repetitive practice

A single instance of a phenomenon does not provide enough information to transfer to another instance. It is only by being exposed to multiple instances that learners can decide which aspects are in common or general, and which are specific to that particular example. Exploring one instance of a linear program (LP) in a standard format gives an initial understanding, but in order to generalize, there must be multiple examples.

Learners, in general, endeavor to make sense of the material by making generalizations about the different examples they are given. If the common elements they perceive are not relevant, the learners make incorrect generalizations. If the first three examples of an LP spreadsheet have all decision variables in the same units, students can reasonably assume that LPs require decision variables to use the same units. To avoid this, the set of examples used must be carefully constructed. If all the hypothesis testing examples result in rejecting the null hypothesis, students gain an incorrect generalization that this is the usual result.

It is popular practice in entry-level statistics courses to require students to collect their own data, analyse and report on it. This is a wonderful way for students to learn and engage with the process of statistical analysis. My concern is that it gives only one example from which the student can construct their understanding of the process. Ideally students would have exposure to many different examples before embarking on their own project.

A learning management system is invaluable. We have a bank of very carefully constructed examples which students work through, to help them gradually develop understanding. The data is real – from questionnaires they or earlier classes completed. There is immediate feedback on submission of their answers, again to reinforce correct concepts. We explain to students that they should not wait until they understand the process completely before they begin, but rather that the understanding comes with doing. There are many parallels for this kind of learning. Chess, sports, driving and speaking a language all develop through practice. Understanding follows practice.

What’s more, this method seems to work. Students are motivated to work through multiple examples so that they internalize the process and improve their understanding. And they gain a sense of accomplishment and confidence at correctly completing the examples.