The Class-size debate – it matters to teachers

Class size matters to teachers

Class size is a perennial question in education. What is the ideal size for a school class? Teachers would like smaller classes, to improve learning. There is evidence, from meta-analysis published in John Hattie’s Visible Learning, of a small positive effect size from reducing class size. But smaller classes make sense, teachers argue – fewer children in the class means more opportunities for one-to-one interactions with the teacher. It makes for easier crowd control, less noise and less stress for teachers and pupils. And in these days of National Standards, it makes the assessment load more realistic.

Educational Research is difficult

I’d just like to point out that educational research is difficult. One of my favourite readings on educational research is an opinion piece by David Berliner, Educational Research: The hardest science of all, where he explains the challenge of educational research. It was written in response to a call by the US Government for evidence-based practices in education. Berliner reminds us of how many different factors contribute to learning. And measuring learning is itself an inexact science. At one point he writes: “It may be stretching a little, but imagine that Newton’s third law worked well in both the northern and southern hemispheres—except of course in Italy or New Zealand—and that the explanatory basis for that law was different in the two hemispheres. Such complexity would drive a physicist crazy, but it is a part of the day-to-day world of the educational researcher.”

Ask the teachers

So with this in mind, I decided to ask the experts. I asked NZ primary school teachers who are just gearing up for the 2017 school year. These teachers were invited via a Facebook group to participate in a very short poll using a Google Form. There were just eight questions – the year level they teach, the minimum, maximum and ideal size for a class at that level, how many children they are expecting in their class this year and how long they have been teaching. The actual wording for the question about ideal class size was: “In your opinion what is the ideal class size that will lead to good learning outcomes for the year level given above?” There were also two open-ended questions about how they had chosen their numbers, and what factors they think contribute to the decision on class-size.

Every time I do something like this, I underestimate how long the analysis will take. There were only eight questions, thought I. How hard can that be…. sigh. But in the interests of reporting back to the teachers as quickly as possible, I will summarise the numeric data, and deal with all the words later.

Early results

There were about 200 useable responses. There was a wide range of experience among the teachers. A third of the teachers had been teaching for five years or less, and 20% had been teaching for more than twenty years. There was no correlation between the perceived ideal class size and the experience of the teacher.

The graph below displays the results, comparing the ideal class-size for the different year levels. Each dot represents the response of one teacher. It is clear that the teachers believe younger children need smaller classes. The median value for the ideal class size for a New Entrant, Year 1 and/or Year 2 class is 16. The median value for the ideal class size for Year 3/4 is 20, for Year 5/6 is 22 and for Year 7/8 is 24. The ideal class size increases as the year level goes up. It is interesting that even numbers are more popular than odd numbers. In the comments, teachers point out that 24 is a very good number for splitting children into equal-sized groups.

These dotplot/boxplots from iNZight show each of the responses, and the summary values.

It is interesting to compare the maximum class size the teachers felt would lead to good learning outcomes. I also asked what class size they will be teaching this year.  The table below gives the median response for the ideal class size, maximum acceptable, and current class size. It is notable that the current class sizes are all at least two students more than the maximum acceptable values, and between six and eight students more than the ideal value.

Median response by year level

Year level               Respondents   Ideal   Maximum acceptable   Current
New Entrant, Year 1/2    56            16      20                   22
Year 3/4                 40            20      24.5                 27.5
Year 5/6                 53            22      25                   30
Year 7/8                 46            24      27                   30
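
For readers who like to check the arithmetic, here is a minimal Python sketch that recomputes the gaps described above from the medians in the table. The variable and function names are my own, chosen for illustration; only the numbers come from the survey.

```python
# Median responses copied from the table above:
# (year level, respondents, ideal, maximum acceptable, current)
rows = [
    ("New Entrant, Year 1/2", 56, 16, 20, 22),
    ("Year 3/4", 40, 20, 24.5, 27.5),
    ("Year 5/6", 53, 22, 25, 30),
    ("Year 7/8", 46, 24, 27, 30),
]


def gaps(rows):
    """Return {year level: (current - maximum, current - ideal)}."""
    return {
        level: (current - maximum, current - ideal)
        for level, _n, ideal, maximum, current in rows
    }


class_size_gaps = gaps(rows)
for level, (over_max, over_ideal) in class_size_gaps.items():
    print(f"{level}: {over_max:g} over the maximum, {over_ideal:g} over the ideal")
```

Bear in mind that these are differences between medians, not the median of each teacher's own difference, so they are only a rough summary.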

Financial considerations

It appears that most teachers will be teaching classes that are considerably larger than desired. This looks like a problem. But it is also important to get the financial context. I asked myself: how much money would it take to reduce all primary school classes by four pupils (bringing most year levels to just below the maximum, but still above the ideal)? Using figures from the Ministry of Education website, and assuming the current figures from the survey are indicative of class sizes throughout New Zealand, we would need about 3500 more classes. That is 3500 more rooms that would need to be provided, and 3500 more teachers to employ. It is an 18% increase in the number of classes. The increase in salaries alone would be over one hundred million dollars per year. This is not a trivial amount of money. It would certainly help with unemployment, but taxes would need to increase, or money would need to come from elsewhere.
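
The size of that increase can be roughly checked with a back-of-envelope sketch. It assumes the same number of pupils is simply re-divided into smaller classes, and it uses the respondent-weighted average of the survey's median current class sizes as a stand-in for a typical class. Both are simplifying assumptions of mine, not the method behind the figures above, which drew on Ministry of Education data.

```python
def extra_class_fraction(current_size, reduction):
    """Fractional increase in the number of classes needed to teach the
    same pupils when every class shrinks by `reduction` pupils."""
    new_size = current_size - reduction
    if new_size <= 0:
        raise ValueError("reduction must be smaller than the current class size")
    return current_size / new_size - 1


# (respondents, median current class size) from the survey table
survey = [(56, 22), (40, 27.5), (53, 30), (46, 30)]
total = sum(n for n, _ in survey)
typical_size = sum(n * size for n, size in survey) / total

increase = extra_class_fraction(typical_size, 4)
print(f"typical class size: {typical_size:.1f}")
print(f"extra classes needed: {increase:.1%}")

# Number of existing classes implied by the post's own figures
# (3500 extra classes described as an 18% increase).
implied_existing_classes = 3500 / 0.18
print(f"implied existing classes: {implied_existing_classes:.0f}")
```

With these inputs the shortcut gives an increase of roughly 17%, in the same region as the 18% quoted above.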

Is this the best way to use the money? Should all classes be reduced or just some? How would we decide? How would it be implemented? If you decrease class sizes suddenly you create a shortage of teachers, and have to fill positions with untrained teachers, which has been shown to decrease the quality of education. Is the improvement worth the money?

My sympathies really are with classroom teachers. (If I were in charge, National Standards would be gone by lunchtime.) I know what a difference a few students in a class makes to all sorts of things. At the same time, this is not a simple problem, and the solution is far from simple. Discussion is good, and informed discussion is even better. Please feel free to comment below. (I will summarise the open-ended responses from the survey in a later post.)

Why people hate statistics

This summer/Christmas break it has been my pleasure to help a young woman who is struggling with statistics, and it has prompted me to ask people who teach postgraduate statistical methods – WTF are you doing?

Louise (name changed) is a bright, hard-working young woman, who has finished an undergraduate degree at a prestigious university and is now doing a Masters degree at a different prestigious university, which is a long way from where I live and will remain nameless. I have been working through her lecture slides, past and future, and attempting to develop in her some confidence that she will survive the remainder of the course, and that statistics is in fact fathomable.

Incomprehensible courses alienating research students

After each session with Louise I have come away shaking my head and wondering what this lecturer is up to. I wonder if he/she really understands statistics or is just passing on their own confusion. And the very sad thing is that I KNOW that there are hundreds of lecturers in hundreds of similar courses around the world teaching in much the same way and alienating thousands of students every year.

And they need to stop.

Here is the approach: You have approximately eight weeks, made up of four-hour sessions, in which to teach your masters students everything they could possibly need to know about statistics. So you tell them everything! You use technical terms with little explanation, and you give no indication of what is important and what is background. You dive right in with no clear purpose, and you expect them to keep up.

Choosing your level

Frequently Louise would ask me to explain something and I would pause to think. I was trying to work out how deep to go. It is like when a child asks where babies come from. They may want the full details, but they may not, and you need to decide what level of answer is most appropriate. Anyone who has seen our popular YouTube videos will be aware that I encourage conceptual understanding at best, and the equivalent of a statistics driver’s licence at worst. When you have eight weeks to learn everything there is to know about statistics, up to and including multiple regression, logistic regression, GLM, factor analysis, non-parametric methods and more, I believe the most you can hope for is to be able to get the computer to run the test, and then make intelligent conclusions about the output.

There was nothing in the course about data collection, data cleaning, the concept of inference or the relationship between the model and reality. My experience is that data cleaning is one of the most challenging parts of analysis, especially for novice researchers.

Use learning objectives

And maybe one of the worst problems with Louise’s course was that there were no specific learning objectives. One of my most popular posts is on the need for learning objectives. Now I am not proposing that we slavishly tell students in each class what it is they are to learn, as that can be tedious and remove the fun from more discovery style learning. What I am saying is that it is only fair to tell the students what they are supposed to be learning. This helps them to know what in the lecture is important, and what is background. They need to know whether they need to have a passing understanding of a test, or if they need to be able to run one, or if they need to know the underlying mathematics.

Take, for example, the t-test. There are many ways that the t-statistic can be used, so simply referring to a test as a t-test is misleading before you even start. And starting your teaching with the statistic is not helpful. We need to start with the need! I would call it a test for the difference of two means from two groups. And I would just talk about the t statistic in passing. I would give examples of output from various scenarios, some of which reject the null, some of which don’t and maybe even one that has a p-value of 0.049 so we can talk about that. In each case we would look at how the context affects the implications of the test result. In my learning objectives I would say: Students will be able to interpret the output of a test for the difference of two means, putting the result in context. And possibly, Students will be able to identify ways in which a test for the difference of two means violates the assumptions of a t-test. Now that wasn’t hard, was it?
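
To make "a test for the difference of two means" concrete, here is a minimal sketch in Python. Note that it swaps in a permutation test rather than the t-test itself, because that needs nothing beyond the standard library and keeps the focus on the question instead of the statistic. The scores are made up for illustration, and the function name and parameters are my own.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_resamples=10_000, seed=1):
    """Two-sided permutation test for a difference of two means.

    Returns (observed difference, approximate p-value).
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        # Shuffle the group labels and recompute the difference in means
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the p-value is never exactly 0
    p_value = (at_least_as_extreme + 1) / (n_resamples + 1)
    return observed, p_value


# Hypothetical test scores for two groups (made up for illustration)
group_a = [62, 71, 68, 75, 80, 66, 73, 77]
group_b = [58, 64, 61, 70, 59, 67, 63, 60]

observed_diff, p_value = permutation_test(group_a, group_b)
print(f"difference in means: {observed_diff:.2f}, p-value about {p_value:.3f}")
```

The interesting part for students is not the number that comes out, but what a difference of that size would mean in the context the data came from.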

Like driving a car

Louise likes to understand where things come from, so we did go through an overview of how various distributions have been found to model different aspects of the world well – starting with the normal distribution, and with a quick jaunt into the Central Limit Theorem. I used my Dragonistics data cards, which were invented for teaching primary school, but actually work surprisingly well at all levels! I can’t claim that Louise understands the use of the t distribution, but I hope she now believes in it. I gave her the analogy of learning to drive – that we don’t need to know what is happening under the bonnet to be a safe driver. In fact safe driving depends more on paying attention to the road conditions and human behaviour.


Louise tells me that her lecturer emphasises assumptions – that the students need to examine them all, every time they look at or perform a statistical test. Now I have no problems with this later on, but students need to have some idea of where they are going and why, before being told what luggage they can and can’t take. And my experience is that assumptions are always violated. Always. As George Box put it – “All models are wrong, but some are useful.”

It did not help that the lecturer seemed a little confused about the assumption of normality. I am not one to point the finger, as this is a tricky assumption, as the Andy Field textbook pointed out. For example, we do not require the independent variables in a multiple regression to be normally distributed as the lecturer specified. This is not even possible if we are including dummy variables. What we do need to watch out for is that the residuals are approximately modelled by a normal distribution, and if not, that we do something about it.
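
The point about dummy variables can be illustrated with a small simulation. This is a sketch under stated assumptions: the data are invented, the model has a single 0/1 predictor, and the "within one and two standard deviations" check is only a rough stand-in for looking at a residual plot.

```python
import random
from statistics import mean, NormalDist

rng = random.Random(42)

# Hypothetical data: x is a dummy variable (0 or 1), so it cannot be
# normally distributed; y depends on x plus normally distributed noise.
x = [0] * 100 + [1] * 100
y = [10 + 3 * xi + rng.gauss(0, 2) for xi in x]

# Simple least-squares fit of y = intercept + slope * x
x_bar, y_bar = mean(x), mean(y)
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
intercept = y_bar - slope * x_bar
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Rough normality check: the share of residuals within one and two
# standard deviations should be near 68% and 95% for a normal shape.
fitted = NormalDist.from_samples(residuals)
within_1sd = sum(abs(r) <= fitted.stdev for r in residuals) / len(residuals)
within_2sd = sum(abs(r) <= 2 * fitted.stdev for r in residuals) / len(residuals)
print(f"slope {slope:.2f}, within 1 SD {within_1sd:.0%}, within 2 SD {within_2sd:.0%}")
```

The predictor here takes only two values, yet the residuals can still look comfortably normal, which is the assumption that actually matters.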

You may have gathered that my approach to statistics is practical rather than idealistic. Why get all hot and bothered about whether you should do a parametric or non-parametric test, when the computer package does both with ease, and you just need to check if there is any difference in the result? (I can hear some purists hyperventilating at this point!) My experience is that the results seldom differ.

What post-graduate statistical methods courses should focus on

Instructors need to concentrate on the big ideas of statistics – what inference is, why we need data, why the way a sample is collected matters, and the relationship between a model and the reality it is modelling. I would include the concept of correlation, and its problematic link to causation. I would talk about the difference between statistical significance and usefulness, and between evidence for a relationship and the strength of that relationship. And I would teach students how to find the right fishing lessons! If a student is critiquing a paper that uses logistic regression, that is the time they need to read up enough about logistic regression to be able to understand what they are reading. They cannot possibly learn a useful amount about all the tests or methods that they may encounter one day.

If research students are going to be doing their own research, they need more than a one-semester fly-by of techniques, and would be best to get advice from a statistician BEFORE they collect the data.

Final word

So here is my take-home message:

Stop making graduate statistical methods courses so outrageously difficult by cramming them full of advanced techniques and concepts. Instead help students to understand what statistics is about, and how powerful and wonderful it can be to find out more about the world through data.

Your word

Am I right or is my preaching of the devil? Please add your comments below.

Has the Numeracy Project failed?

The Numeracy Development Project has influenced the teaching of mathematics in New Zealand. It has changed the language people use to talk about mathematical understanding, introducing the terms “multiplicative thinking”, “part-whole” and “proportional reasoning” to the teacher toolkit. It has empowered some teachers to think differently about the teaching of mathematics. It has brought “number” front and centre, often crowding out algebra, geometry, measurement and statistics, which are now commonly called the strands. It has baffled a large number of parents. Has the Numeracy Development Project been a success? If not, how can we fix it?

I have been pondering the efficacy and side-effects of the Numeracy Project in New Zealand. I have heard criticisms from Primary and Secondary teachers, and defence and explanation from advisors. I have listened to a very illuminating podcast from one of the originators of the Numeracy Project, Ian Stevens, and I have had discussions with another educational developer who was there at the beginning. I even downloaded some of the “pink booklets” and began reading them, in order to understand the Numeracy Project.

Then I read this article from the US organisation, National Council of Teachers of Mathematics, Strategies are not Algorithms, and it all started to fall into place.

The authors explain that researchers analysed the way that children learn about mathematics, and the stages they generally go through. It was found that “Students who used invented strategies before they learned standard algorithms demonstrated better knowledge of base-ten number concepts and were more successful in extending their knowledge to new situations than were students who initially learned standard algorithms.” They claim that in the US “(t)he idea of “invented strategies” has been distorted to such a degree that strategies are being treated like algorithms in many textbooks and classrooms across the country.” I suspect this statement also applies in New Zealand.

Strategies taught as algorithms

Whitacre and Wessenberg refer to a paper by Carpenter et al, A Longitudinal Study of Invention and Understanding in Children’s Multidigit Addition and Subtraction. I was able to get access to read it, and found the following:
“Although we have no data regarding explicit instruction on specific invented strategies, we hypothesize that direct instruction could change the quality of children’s understanding and use of invented strategies. If these strategies were the object of direct instruction, there would be a danger that children would learn them as rote procedures in much the way that they learn standard algorithms today.” (Emphasis added)

Were they right? Are the strategies being taught as rote procedures in some New Zealand classrooms? Do we need to change the way we talk about them?

How I see the Numeracy Development Project (NDP)

The NDP started as a way to improve teacher pedagogical content knowledge to improve outcomes for students. It was intended to cover all aspects of the New Zealand Mathematics and Statistics curriculum, not just number. Ian Stevens explained: “Numeracy was never just Number. We decided that in New Zealand numeracy meant mathematics and mathematics meant numeracy.”

The Numeracy Development Project provided a model to understand progression of understanding in learning mathematics. George Box once said “All models are wrong, but some are useful.” A model of progression of understanding is useful for identifying where we are, and how to progress to where we would like to be, rather like a map. But a map is not the landscape, and children differ, circumstances change, and models in education change faster than most. I recently attended a talk by Shelley Dole, who (I think) suggested that by emphasising additive thinking in the early school years, we may undo the multiplicative and proportional thinking the students had already. If all they see is adding and subtracting, any implication towards multiplicative and proportional thinking is stifled. It is an interesting premise.

The Numeracy Project (as it is now commonly called) suggested teaching methods, strongly based around group-work and minimising the use of worksheets. Popular invented strategies for arithmetic operations were described, and the teaching of standard algorithms such as vertical alignment of numbers when adding and subtracting was de-emphasised.

An unintended outcome is that the Numeracy Project has replaced the NZ curriculum in some schools, with “Number” taking centre stage for many years. Teachers are teaching invented strategies as algorithms rather than letting students work them out for themselves. At times students are required to know all the strategies before moving on. Textbooks, worksheets and even videos based around the strategies abound, which seems anathema to the original idea.

Where now?

So where do we go from here?

To me, empowerment of teachers is pivotal. Teachers need to understand and embrace the beauty of number theory, the practicality of measurement, the art and challenge of geometry, the detective possibilities in data and the power of algebra to model our world. When mathematics is seen as a way to view the world, and embedded in all our teaching, in the way literacy is, maybe then we will see the changes we seek.

Why Journalists need to understand statistics – Sensational Listener article about midwifery risks

The recent article in the Listener highlights again the need for all citizens to be statistically literate. In particular I believe statistical literacy should be a compulsory part of all journalists’ training. I have written before about this. I was happy to see letters to the Editor in the 22 October issue of the Listener condemning the sensationalist cover, which was not supported in the article, and even less supported in the original research. I like the Listener, and subscribe, but this was badly done!

The following was written by a fellow statistician, John Maindonald and published here with his permission.

Midwife led vs Medical led models of care

A just-published major observational study comparing midwife-led with medical-led models of care has attracted extensive media attention. The front cover of the NZ Listener (October 8) presented the “results” in particularly sensationalist terms (“ALARMING MATERNITY RESEARCH …”).

Much more alarming is what this sensationalist cover page has made of results that are, at an optimistic best, suggestive.

Adjustments, inevitably simplistic, were made for 8 factors in which the groups differed.  There is, with so many factors operating, no good way to be sure that the inevitably simple forms of adjustment were adequate.  Additionally, there will have been differences in mothers’ circumstances that the deprivation index used was too crude to capture.  Substance abuse was not taken into consideration.

Here are further links:

(Otago U PR)

(mildly skeptical comments)

(the paper)

I am disappointed that in its response to criticism of its presentation in Letters to the Editor, the Listener (October 22) continues to defend its reporting.

John Maindonald.

Mathematics teaching Rockstar – Jo Boaler

Moving around the education sector

My life in education has included being a High School maths teacher, then teaching at university for 20 years. I then made resources and gave professional development workshops for secondary school teachers. It was exciting to see the new statistics curriculum being implemented into the New Zealand schools. And now we are making resources and participating in the primary school sector. It is wonderful to learn from each level of teaching. We would all benefit from more discussion across the levels.

Educational theory and idea-promoters

My father used to say (and the sexism has not escaped me) “Never run after a woman, a bus or an educational theory, as there will be another one along soon.” Education theories have lifespans, and some theories are more useful than others. I am not a fan of “learning styles” and fear they have served many students ill. However, there are some current ideas and idea-promoters in the teaching of mathematics that I find very attractive. I will begin with Jo Boaler, and intend to introduce you over the next few weeks to Dan Meyer, Carol Dweck and the person who wrote “Making it stick.”

Jo Boaler

My first contact with Jo Boaler was reading “The Elephant in the Classroom.” In this Jo points out how society is complicit in the idea of a “maths brain”. Somehow it is socially acceptable to admit or be almost defensively proud of being “no good at maths”. A major problem with this is that her research suggests that later success in life is connected to attainment in mathematics. In order to address this, Jo explores a less procedural approach to teaching mathematics, including greater communication and collaboration.

Mathematical Mindsets

It is interesting to see the effect Jo Boaler’s recent book, “Mathematical Mindsets”, is having on colleagues in the teaching profession. The maths advisors based in Canterbury NZ are strong proponents of her idea of “rich tasks”. Here are some tweets about the book:

“I am loving Mathematical Mindsets by @joboaler – seriously – everyone needs to read this”

“Even if you don’t teach maths this book will change how you teach for ever.”

“Hands down the most important thing I have ever read in my life”

What I get from Jo Boaler’s work is that we need to rethink how we teach mathematics. The methods that worked for mathematics teachers are not the methods we need to be using for everyone. The defence “The old ways worked for me” is not defensible in terms of inclusion and equity. I will not even try to boil down her approach in this post, but rather suggest readers visit her website and read the book!

At Statistics Learning Centre we are committed to producing materials that fit with sound pedagogical methods. Our Dragonistics data cards are perfect for use in a number of rich tasks. We are constantly thinking of ways to embed mathematics and statistics tasks into the curriculum of other subjects.

Challenges of implementation

I am aware that many of you readers are not primary or secondary teachers. There are so many barriers to getting mathematics taught in a more exciting, integrated and effective way. Primary teachers are not mathematics specialists, and may well feel less confident in their maths ability. Secondary mathematics teachers may feel constrained by the curriculum and the constant assessment in the last three years of schooling in New Zealand. And tertiary teachers have little incentive to improve their teaching, as it takes time from the more valued work of research.

Though it would be exciting if Jo Boaler’s ideas and methods were espoused in their entirety at all levels of mathematics teaching, I am aware that this is unlikely – as in a probability of zero. However, I believe that teachers at all levels can improve, even a little at a time. We at Statistics Learning Centre are committed to this vision. Through our blog, our resources, our games, our videos, our lessons and our professional development we aim to empower all teachers to teach statistics – better! We espouse the theories and teachings explained in Mathematical Mindsets, and hope that you also will learn about them, and endeavour to put them into place, whatever level you teach at.

Do tell us if Jo Boaler’s work has had an impact on what you do. How can the ideas apply at all levels of teaching? Do teachers need to have a growth mindset about their own ability to improve their teaching?

Here are some quotes to leave you with:

Mathematical Mindsets Quotes

“Many parents have asked me: What is the point of my child explaining their work if they can get the answer right? My answer is always the same: Explaining your work is what, in mathematics, we call reasoning, and reasoning is central to the discipline of mathematics.”
“Numerous research studies (Silver, 1994) have shown that when students are given opportunities to pose mathematics problems, to consider a situation and think of a mathematics question to ask of it—which is the essence of real mathematics—they become more deeply engaged and perform at higher levels.”
“The researchers found that when students were given problems to solve, and they did not know methods to solve them, but they were given opportunity to explore the problems, they became curious, and their brains were primed to learn new methods, so that when teachers taught the methods, students paid greater attention to them and were more motivated to learn them. The researchers published their results with the title “A Time for Telling,” and they argued that the question is not “Should we tell or explain methods?” but “When is the best time to do this?””
“five suggestions that can work to open mathematics tasks and increase their potential for learning: Open up the task so that there are multiple methods, pathways, and representations. Include inquiry opportunities. Ask the problem before teaching the method. Add a visual component and ask students how they see the mathematics. Extend the task to make it lower floor and higher ceiling. Ask students to convince and reason; be skeptical.”

All quotes from

Jo Boaler, Mathematical Mindsets: Unleashing Students’ Potential through Creative Math, Inspiring Messages and Innovative Teaching

Teachers and resource providers – uneasy bedfellows

Trade stands and cautious teachers

It is interesting to provide a trade stand at a teachers’ conference. Some teachers are keen to find out about new things, and come to see how we can help them. Others studiously avoid eye-contact in the fear that we might try to sell them something. Trade stand holders regularly put sweets and chocolate out as “bait” so that teachers will approach close enough to engage. Maybe it gives the teachers an excuse to come closer? Either way it is representative of the uneasy relationship that “trade” has with salaried educators.

Money and education

Money and education have an uneasy relationship. For schools to function, they need considerable funding – always more than what they get. In New Zealand, and in many countries, education is predominantly funded by the state. Schools are built and equipped, teachers are paid and resources are purchased with money provided by the taxpayer. Extras are raised through donations from parents and fund-raising efforts. However, because it is not apparent that money is changing hands, schools are perceived as virtuous establishments, existing only because of the goodness of the teachers. This contrasts with the attitude to resource providers, who are sometimes treated as parasitic with their motives being all about the money. It is possible that some resource providers are in it just for the money, but it seems to me that there are richer seams to mine in health, sport, retail etc.

Statistics Learning Centre is a social enterprise

Statistics Learning Centre is a social enterprise. We fit in the fuzzy area between “not-for-profit” and commercial enterprise. We measure our success by the impact we are having in empowering teachers to teach statistics and all people to understand statistics. We need money in order to continue to make an impact. Statistics Learning Centre has made considerable contributions to the teaching and learning of statistics in New Zealand and beyond for several years. This post lists just some of the impact we have had.  We believe in what we are doing, and work hard so that our social enterprise is on a solid financial footing.

StatsLC empowers teachers

Soon after the change to the NCEA Statistics standards, there was a shortage of good quality practice external exams. Even the ones provided as official exemplars did not really fit the curriculum. Teachers approached us, requesting that we create practice exams that they could trust were correct and aligned to the curriculum. We did so in 2015 and 2016, at considerable personal effort and only marginal financial recompense. We see that as helping statistics to be better understood in schools and the wider community.

We, at Statistics Learning Centre, grasp at opportunities to teach teachers how to teach statistics better, to empower all teachers to teach statistics. Our workshops are well received, and we have regular attenders who know they will get value for their time. We use an inclusive, engaging approach, and participants have a good time. I believe in our resources – the videos, the quizzes, the data cards, the activities, the professional development. I believe that they are among the best you can get. So when I give workshops, I do talk about the resources. It would seem counter-productive for all concerned, not to mention contrived, to do otherwise. They are part of a full professional development session. Many mathematical associations have no trouble with this, and I love to go to conferences, and contribute.

I am aware that there are some commercial enterprises who wish to give commercial presentations at conferences. If their materials are not of a high standard, this can put the organisers in a difficult position. Consequently some organisations have a blanket ban on any presentations that reference any paid product. I feel this is a little unfortunate, as teachers miss out on worthwhile contributions. But I understand the problem.

The Open Market model – supply and demand

I believe that there is value in a market model for resources.  People have suggested that we should get the Government to fund access to Statistics Learning Centre resources for all schools. That would be delightful, and give us the freedom and time to create even better resources. But that would make it almost impossible for any other new provider, who may have an even better product, to get a look in. When such a monopoly occurs, it reduces the incentives for providers to keep improving.

Saving work for the teachers, and building on a product

Teachers want the best for their students, and have limited budgets. They may spend considerable amounts of time printing, cutting and laminating in order to provide teaching resources at a low cost. This was one of the drivers for producing our Dragonistics data cards – to provide at a reasonable cost, some ready-made, robust resources, so that teachers did not have to make their own. As it turned out we were able to provide interesting data with clear relationships, and engaging graphics so that we provide something more than just data turned into datacards.

Free resources

There are free resources available on the internet. Other resources are provided by teachers who are sharing what they have done while teaching their own students. Resources provided for free can be of a high pedagogical standard. Having a high production standard, however, can be prohibitively expensive for individual producers who are working in their spare time.  It can also be tricky for another teacher to know what is suitable, and a lot of time can be spent trying to find high quality, reliable resources.

Teachers and resource providers – a symbiotic relationship

Teachers need good resource providers. It makes sense for experts to create high quality resources, drawing on current thinking with regard to content specific pedagogy. These can support teachers, particularly in areas in which they are less confident, such as statistics. And they do need to be paid for their work.

It helps when people recognise that our materials are sound and innovative, when they give us opportunities to contribute and when they include us at the decision-making table. Let us know how we can help you, and in partnership we can become better bed-fellows.

What do you think?


(Note that this post is also being published on our blog, Building a Statistics Learning Community, as I felt it was important.)


Data for teaching – real, fake, fictional

There is a push for teachers and students to use real data in learning statistics. In this post I am going to address the benefits and drawbacks of different sources of real data, and make a case for the use of good fictional data as part of a statistical programme.

Here is a video introducing our fictional data set of 180 or 240 dragons, so you know what I am referring to.

Real collected, real database, trivial, fictional

There are two main types of real data. There is the real data that students themselves collect and there is real data in a dataset, collected by someone else, and available in its entirety. There are also two main types of unreal data. The first is trivial and lacking in context and useful only for teaching mathematical manipulation. The second is what I call fictional data, which is usually based on real-life data, but with some extra advantages, so long as it is skilfully generated. Poorly generated fictional data, as often found in case studies, is very bad for teaching.


When deciding what data to use for teaching statistics, it matters what it is that you are trying to teach. If you are simply teaching how to add up 8 numbers and divide the result by 8, then you are not actually doing statistics, and trivial fake data will suffice. Statistics only exists when there is a context. If you want to teach about the statistical enquiry process, then having the students genuinely involved at each stage of the process is a good idea. If you are particularly wanting to teach about fitting a regression line, you generally want to have multiple examples for students to use. And it would be helpful for there to be at least one linear relationship.

I read a very interesting article in “Teaching Children Mathematics” entitled, “Practical Problems: Using Literature to Teach Statistics”. The authors, Hourigan and Leavy, used a children’s book to generate the data on the number of times different characters appeared. But what I liked most was that they addressed the need for a “driving question”. In this case the question was provided by a pre-school teacher who could only afford to buy one puppet for the book, and wanted to know which character appears the most in the story. The children practised collecting data as the story was read aloud. They collected their own data to analyse.

Let’s have a look at the different pros and cons of student-collected data, provided real data, and high-quality fictional data.

Collecting data

When we want students to experience the process of collecting real data, they need to collect real data. However, real-time data collection is time consuming, and probably not necessary every year. Student data collection can be simulated by a program such as The Islands, which I wrote about previously. Data students collect themselves is much more likely to have errors in it, or be “dirty” (which is a good thing). When students are only given clean datasets, such as those usually provided with textbooks, they do not learn the skills of deciding what to do with an errant data point. Fictional databases can also have dirty data generated into them. The fictional inhabitants of The Islands sometimes lie, and often refuse to give consent for data collection on them.


One of the species of dragons included in our database

I have heard that after a few years of school, graphs about cereal preference, number of siblings and type of pet get a little old. These topics, relating to the students, are motivating at first, but often there is no purpose to the investigation other than to get data for a graph.  Students need to move beyond their own experience and are keen to try something new. Data provided in a database can be motivating, if carefully chosen. There are opportunities to use databases that encourage awareness of social justice, the environment and politics. Fictional data must be motivating or there is no point! We chose dragons as a topic for our first set of fictional data, as dragons are interesting to boys and girls of most ages.

A meaningful question

Here I refer again to that excellent article that talks about a driving question. There needs to be a reason for analysing the data. Maybe there is concern about food provided at the tuck shop, with healthy alternatives. Or can the question be tied into another area of the curriculum, such as which type of bean plant grows faster? Or can we increase the germination rate of seeds? The Census@school data has the potential for driving questions, but they probably need to be helped along. For existing datasets the driving question used by students might not be the same as the one (if any) driving the original collection of data. Sometimes that is because the original purpose is not ‘motivating’ for the students or not at an appropriate level. If you can’t find or make up a motivating meaningful question, the database is not appropriate. For our fictional dragon data, we have developed two scenarios – vaccinating for Pacific Draconian flu, and building shelters to make up for the deforestation of the island. With the vaccination scenario, we need to know about behaviour and size. For the shelter scenario we need to make decisions based on size, strength, behaviour and breath type. There is potential for a number of other scenarios that will also create driving questions.

Getting enough data

It can be difficult to get enough data for effects to show up. When students are limited to their class or family, this limits the number of observations. Only some databases have enough observations in them. There is no such problem with fictional databases, as you can just generate as much data as you need! There are special issues with regard to teaching about sampling, where you would want a large database with constrained access, like the Islands data, or the use of cards.


A problem with the data students collect is that it tends to be categorical, which limits the types of analysis that can be used. In databases, it can also be difficult to find measurement level data. In our fictional dragon database, we have height, strength and age, which all take numerical values. There are also four categorical variables. The Islands database has a large number of variables, both categorical and numerical.

Interesting Effects

Though it is good for students to understand that quite often there is no interesting effect, we would like students to have the satisfaction of finding interesting effects in the data, especially at the start. Interesting effects can be particularly exciting if the data is real, and they can apply their findings to the real world context. Student-collected-data is risky in terms of finding any noticeable relationships. It can be disappointing to do a long and involved study and find no effects. Databases from known studies can provide good effects, but unfortunately the variables with no effect tend to be left out of the databases, giving a false sense that there will always be effects. When we generate our fictional data, we make sure that there are the relationships we would like there, with enough interaction and noise. This is a highly skilled process, honed by decades of making up data for student assessment at university. (Guilty admission)
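As a rough illustration of what building relationships and noise into fictional data can look like, here is a minimal Python sketch. It is not the method used for our dragon data: the function name `make_dragons`, the slope, intercept and noise values, and the choice to make strength unrelated to height are all assumptions invented for this example.

```python
import random
import statistics


def make_dragons(n, slope=2.0, intercept=50.0, noise_sd=10.0, seed=1):
    """Generate n fictional dragons (all parameter values are invented)."""
    rng = random.Random(seed)  # fixed seed so everyone sees the same data
    dragons = []
    for _ in range(n):
        age = rng.uniform(1, 40)
        # Built-in effect: height rises with age, blurred by random noise.
        height = intercept + slope * age + rng.gauss(0, noise_sd)
        # Deliberately no effect: strength is generated independently.
        strength = rng.uniform(1, 10)
        dragons.append({"age": age, "height": height, "strength": strength})
    return dragons


def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from first principles."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


dragons = make_dragons(240)
ages = [d["age"] for d in dragons]
heights = [d["height"] for d in dragons]
strengths = [d["strength"] for d in dragons]

r_age_height = pearson_r(ages, heights)
r_strength_height = pearson_r(strengths, heights)
print(f"age vs height:      r = {r_age_height:.2f}")
print(f"strength vs height: r = {r_strength_height:.2f}")
```

Raising `noise_sd` makes the age and height relationship harder to see, and keeping a variable with no effect in the data is one way to avoid giving the impression that there will always be effects.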


There are ethical issues to be addressed in the collection of real data from people the students know. Informed consent should be granted, and there needs to be thorough vetting. Young students (and not so young) can be damagingly direct in their questions. You may need to explain that it can be upsetting for people to be asked if they have been beaten or bullied. When using fictional data, that may appear real, such as the Islands data, it is important for students to be aware that the data is not real, even though it is based on real effects. This was one of the reasons we chose to build our first database on dragons, as we hope that will remove any concerns about whether the data is real or not!

The following table summarises the post.

| | Real data collected by the students | Real existing database | Fictional data (The Islands, Kiwi Kapers, Dragons, Desserts) |
|---|---|---|---|
| Data collection | Real experience | Nil | Sometimes |
| Dirty data | Always | Seldom | Can be controlled |
| Motivating | Can be | Can be | Must be! |
| Enough data | Time consuming, difficult | Hard to find | Always |
| Meaningful question | Sometimes. Can be trivial | Can be difficult | Part of the fictional scenario |
| Variables | Tend towards nominal | Often too few variables | Generate as needed |
| Ethical issues | Often | Usually fine | Need to manage reality |
| Effects | Unpredictable | Can be obvious or trivial, or difficult | Can be managed |

Divide and destroy in statistics teaching

A reductionist approach to teaching statistics destroys its very essence

I’ve been thinking a bit about systems thinking and reductionist thinking, especially with regard to statistics teaching and mathematics teaching. I used to teach a course on systems thinking, with regard to operations research. Systems thinking is concerned with the whole. The parts of the system interact and cannot be isolated without losing the essence of the system. Modern health providers and social workers realise that a child is a part of a family, which may be a part of a larger community, all of which have to be treated if the child is to be helped. My sister, a physio, always finds out about the home background of her patient, so that any treatment or exercise regime will fit in with their life. Reductionist thinking, by contrast, reduces things to their parts, and isolates them from their context.

Reductionist thinking in teaching mathematics

Mathematics teaching lends itself to reductionist thinking. You strip away the context, then break a problem down into smaller parts, solve the parts, and then put it all back together again. Students practise solving straight-forward problems over and over to make sure they can do it right. They feel that a column of little red ticks is evidence that they have learned something correctly. As a school pupil, I loved the columns of red ticks. I have written about the need for drill in some aspects of statistics teaching and learning, and can see the value of automaticity – or the ability to answer something without having to think too hard. That can be a little like learning a language – you need to be automatic on the vocabulary and basic verb structures. I used to spend my swimming training laps conjugating Latin verbs – amo, amas, amat (breathe), amamus, amatis, amant (breathe). I never did meet any ancient Romans to converse with, to see if my recitation had helped any, but five years of Latin vocab is invaluable in pub quizzes. But learning statistics has little in common with learning a language.

There is more to teaching than having students learn how to get stuff correct. Learning involves the mind, heart and hands. The best learning occurs when students actually want to know the answer. This doesn’t happen when context has been removed.

I was struck by Jo Boaler’s, “The Elephant in the Classroom”, which opened my eyes to how monumentally dull many mathematics lessons can be to so many people. These people are generally the ones who do not get satisfied by columns of red ticks, and either want to know more and ask questions, or want to be somewhere else. Holistic lessons, that involve group work, experiential learning, multiple solution methods and even multiple solutions, have been shown to improve mathematics learning and results, and have lifelong benefits to the students. The book challenged many of my ingrained feelings about how to teach and learn mathematics.

Teach statistics holistically, joyfully

Teaching statistics is inherently suited for a holistic approach. The problem must drive the model, not the other way around. Teachers of mathematics need to think more like teachers of social sciences if they are to capture the joy of teaching and learning statistics.

At one time I was quite taken with an approach suggested for students who are struggling, which is to go step-by-step through a number of examples in parallel and doing one step, before moving on to the next step. The examples I saw are great, and use real data, and the sentences are correct. I can see how that might appeal to students who are finding the language aspects difficult, and are interested in writing an assignment that will get them a passing grade. However I now have concerns about the approach, and it has made me think again about some of the resources we provide at Statistics Learning Centre. I don’t think a reductionist approach is suitable for the study of statistics.

Context, context, context

Context is everything in statistical analysis. Every time we produce a graph or a numerical result we should be thinking about the meaning in context. If there is a difference between the medians showing up in the graph, and reinforced by confidence intervals that do not overlap, we need to be thinking about what that means about the heart-rate in swimmers and non-swimmers, or whatever the context is. For this reason every data set needs to be real. We cannot expect students to want to find real meaning in manufactured data. And students need to spend long enough in each context in order to be able to think about the relationship between the model and the real-life situation. This is offset by the need to provide enough examples from different contexts so that students can learn what is general to all such models, and what is specific to each. It is a question of balance.

Keep asking questions

In my effort to help improve teaching of statistics, we are now developing teaching guides and suggestions to accompany our resources. I attend workshops, talk to teachers and students, read books, and think very hard about what helps all students to learn statistics in a holistic way. I do not begin to think I have the answers, but I think I have some pretty good questions. The teaching of statistics is such a new field, and so important. I hope we all keep asking questions about what we are teaching, and how and why.

Don’t teach significance testing – Guest post

The following is a guest post by Tony Hak of Rotterdam School of Management. I know Tony would love some discussion about it in the comments. I remain undecided either way, so would like to hear arguments.


It is now well understood that p-values are not informative and are not replicable. Soon null hypothesis significance testing (NHST) will be obsolete and will be replaced by the so-called “new” statistics (estimation and meta-analysis). This requires that undergraduate courses in statistics already teach estimation and meta-analysis as the preferred way to present and analyze empirical results. If not, then the statistical skills of the graduates from these courses will be outdated on the day these graduates leave school. But it is less evident whether or not NHST (though not preferred as an analytic tool) should still be taught. Because estimation is already routinely taught as a preparation for the teaching of NHST, the necessary reform in teaching will not require the addition of new elements in current programs but rather the removal of the current emphasis on NHST or the complete removal of the teaching of NHST from the curriculum. The current trend is to continue the teaching of NHST. In my view, however, teaching of NHST should be discontinued immediately because it is (1) ineffective, (2) dangerous, and (3) serves no aim.

1. Ineffective: NHST is difficult to understand and it is very hard to teach it successfully

We know that even good researchers often do not appreciate the fact that NHST outcomes are subject to sampling variation and believe that a “significant” result obtained in one study almost guarantees a significant result in a replication, even one with a smaller sample size. Is it then surprising that our students also do not understand what NHST outcomes do tell us and what they do not tell us? In fact, statistics teachers know that the principles and procedures of NHST are not well understood by undergraduate students who have successfully passed their courses on NHST. Courses on NHST fail to achieve their self-stated objectives, assuming that these objectives include achieving a correct understanding of the aims, assumptions, and procedures of NHST as well as a proper interpretation of its outcomes. It is very hard indeed to find a comment on NHST in any student paper (an essay, a thesis) that is close to a correct characterization of NHST or its outcomes. There are many reasons for this failure, but obviously the most important one is that NHST is a very complicated and counterintuitive procedure. It requires students and researchers to understand that a p-value is attached to an outcome (an estimate) based on its location in (or relative to) an imaginary distribution of sample outcomes around the null. Another reason, connected to their failure to understand what NHST is and does, is that students believe that NHST “corrects for chance” and hence they cannot cognitively accept that p-values themselves are subject to sampling variation (i.e. chance).
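The point that p-values are themselves subject to sampling variation can be seen in a small simulation. The sketch below is only an illustration under stated assumptions: a hypothetical true effect of 0.5 standard deviations, samples of 20, a known standard deviation of 1, and a simple two-sided z-test. None of these values come from the paper.

```python
import random
from statistics import NormalDist, fmean


def z_test_p_value(sample, sigma=1.0, mu0=0.0):
    """Two-sided p-value for H0: mean == mu0, with known sigma."""
    n = len(sample)
    z = (fmean(sample) - mu0) / (sigma / n ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def replicate_p_values(true_mean=0.5, n=20, replications=1000, seed=42):
    """Repeat the 'same study' many times and collect each p-value."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(replications):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        p_values.append(z_test_p_value(sample))
    return p_values


p_values = replicate_p_values()
share_significant = sum(p < 0.05 for p in p_values) / len(p_values)
print(f"smallest p-value: {min(p_values):.4f}")
print(f"largest p-value:  {max(p_values):.4f}")
print(f"share with p < 0.05: {share_significant:.2f}")
```

Every replication draws from exactly the same population with the same true effect, yet the p-values range widely and a sizeable share of replications fail to reach “significance”.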

2. Dangerous: NHST thinking is addictive

One might argue that there is no harm in adding a p-value to an estimate in a research report and, hence, that there is no harm in teaching NHST in addition to teaching estimation. However, the mixed experience with statistics reform in clinical and epidemiological research suggests that a more radical change is needed. Reports of clinical trials and of studies in clinical epidemiology now usually report estimates and confidence intervals, in addition to p-values. However, as Fidler et al. (2004) have shown, and contrary to what one would expect, authors continue to discuss their results in terms of significance. Fidler et al. therefore concluded that “editors can lead researchers to confidence intervals, but can’t make them think”. This suggests that a successful statistics reform requires a cognitive change that should be reflected in how results are interpreted in the Discussion sections of published reports.

The stickiness of dichotomous thinking can also be illustrated with the results of a more recent study of Coulson et al. (2010). They presented estimates and confidence intervals obtained in two studies to a group of researchers in psychology and medicine, and asked them to compare the results of the two studies and to interpret the difference between them. It appeared that a considerable proportion of these researchers, first, used the information about the confidence intervals to make a decision about the significance of the results (in one study) or the non-significance of the results (of the other study) and, then, drew the incorrect conclusion that the results of the two studies were in conflict. Note that no NHST information was provided and that participants were not asked in any way to “test” or to use dichotomous thinking. The results of this study suggest that NHST thinking can (and often will) be used by those who are familiar with it.
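The kind of comparison described above can be made concrete with a short calculation. The numbers below are invented for illustration and are not the values used by Coulson et al.; the sketch also assumes two independent studies and normal-based 95% intervals.

```python
import math

Z_95 = 1.96  # normal critical value for a 95% interval


def confidence_interval(estimate, se):
    """Normal-based 95% confidence interval for an estimate."""
    return (estimate - Z_95 * se, estimate + Z_95 * se)


# Hypothetical results from two studies of the same effect.
study_a = {"estimate": 3.6, "se": 1.5}
study_b = {"estimate": 2.2, "se": 1.8}

ci_a = confidence_interval(study_a["estimate"], study_a["se"])
ci_b = confidence_interval(study_b["estimate"], study_b["se"])

# Interval for the difference between the two independent estimates.
diff = study_a["estimate"] - study_b["estimate"]
se_diff = math.sqrt(study_a["se"] ** 2 + study_b["se"] ** 2)
ci_diff = confidence_interval(diff, se_diff)

print(f"Study A: {ci_a[0]:.2f} to {ci_a[1]:.2f}")
print(f"Study B: {ci_b[0]:.2f} to {ci_b[1]:.2f}")
print(f"A minus B: {ci_diff[0]:.2f} to {ci_diff[1]:.2f}")
```

In this made-up case one interval excludes zero and the other includes it, so dichotomous thinking would label one study “significant” and the other not. The interval for the difference comfortably includes zero, however, so the two studies are consistent with each other rather than in conflict.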

The fact that it appears to be very difficult for researchers to break the habit of thinking in terms of “testing” is, as with every addiction, a good reason to keep future researchers from coming into contact with it in the first place and, if contact cannot be avoided, to provide them with robust resistance mechanisms. The implication for statistics teaching is that students should, first, learn estimation as the preferred way of presenting and analyzing research information and that they should be introduced to NHST, if at all, only after estimation has become their routine statistical practice.

3. It serves no aim: Relevant information can be found in research reports anyway

Our experience that teaching of NHST fails its own aims consistently (because NHST is too difficult to understand) and the fact that NHST appears to be dangerous and addictive are two good reasons to immediately stop teaching NHST. But there is a seemingly strong argument for continuing to introduce students to NHST, namely that a new generation of graduates will not be able to read the (past and current) academic literature in which authors themselves routinely focus on the statistical significance of their results. It is suggested that someone who does not know NHST cannot correctly interpret outcomes of NHST practices. This argument has no value for the simple reason that it is assumed in the argument that NHST outcomes are relevant and should be interpreted. But the reason that we have the current discussion about teaching is the fact that NHST outcomes are at best uninformative (beyond the information already provided by estimation) and are at worst misleading or plain wrong. The point is all along that nothing is lost by just ignoring the information that is related to NHST in a research report and by focusing only on the information that is provided about the observed effect size and its confidence interval.


Coulson, M., Healy, M., Fidler, F., & Cumming, G. (2010). Confidence Intervals Permit, But Do Not Guarantee, Better Inference than Statistical Significance Testing. Frontiers in Quantitative Psychology and Measurement, 20(1), 37-46.

Fidler, F., Thomason, N., Finch, S., & Leeman, J. (2004). Editors Can Lead Researchers to Confidence Intervals, But Can’t Make Them Think: Statistical Reform Lessons from Medicine. Psychological Science, 15(2), 119-126.

This text is a condensed version of the paper “After Statistics Reform: Should We Still Teach Significance Testing?” published in the Proceedings of ICOTS9.


The Myth of Random Sampling

I feel a slight quiver of trepidation as I begin this post – a little like the boy who pointed out that the emperor has no clothes.

Random sampling is a myth. Practical researchers know this and deal with it. Theoretical statisticians live in a theoretical world where random sampling is possible and ubiquitous – which is just as well really. But teachers of statistics live in a strange half-real-half-theoretical world, where no one likes to point out that real-life samples are seldom random.

The problem in general

In order for most inferential statistical conclusions to be valid, the sample we are using must obey certain rules. In particular, each member of the population must have an equal chance of being chosen. In this way we reduce the opportunity for systematic error, or bias. When a truly random sample is taken, it is almost miraculous how well we can draw conclusions about the source population, with even a modest sample of a thousand. On a side note, if the general population understood this, and the opportunity for bias and corruption were eliminated, general elections and referenda could be done at much less cost, through taking a good random sample.
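To put a number on how well a modest sample of a thousand can do, here is a small Python sketch of the usual approximate 95% margin of error for a proportion. It assumes a simple random sample from a large population and uses the worst case of a 50/50 split; the function name `margin_of_error` is just a label for this example.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a sample proportion."""
    return z * math.sqrt(p * (1 - p) / n)


moe_1000 = margin_of_error(1000)
print(f"n = 1000: about ±{moe_1000 * 100:.1f} percentage points")
```

With 1000 people the margin of error is about three percentage points, and it does not depend on the size of the population, only on the size of the sample.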

However! It is actually quite difficult to take a random sample of people. Random sampling is doable in biology, I suspect, where seeds or plots of land can be chosen at random. It is also fairly possible in manufacturing processes. Medical research relies on the use of a random sample, though it is seldom of the total population. Really it is more about randomisation, which can be used to support causal claims.

But the area of most interest to most people is people. We actually want to know about how people function, what they think, their economic activity, sport and many other areas. People find people interesting. To get a really good sample of people takes a lot of time and money, and is outside the reach of many researchers. In my own PhD research I approximated a random sample by taking a stratified, cluster semi-random almost convenience sample. I chose representative schools of different types throughout three diverse regions in New Zealand. At each school I asked all the students in a class at each of three year levels. The classes were meant to be randomly selected, but in fact were sometimes just the class that happened to have a teacher away, as my questionnaire was seen as a good way to keep them quiet. Was my data of any worth? I believe so, of course. Was it random? Nope.

Problems people have in getting a good sample include cost, time and also response rate. Much of the data that is cited in papers is far from random.

The problem in teaching

The wonderful thing about teaching statistics is that we can actually collect real data and do analysis on it, and get a feel for the detective nature of the discipline. The problem with sampling is that we seldom have access to truly random data. By random I am not meaning just simple random sampling, the least simple method! Even cluster, systematic and stratified sampling can be a challenge in a classroom setting. And sometimes if we think too hard we realise that what we have is actually a population, and not a sample at all.

It is a great experience for students to collect their own data. They can write a questionnaire and find out all sorts of interesting things, through their own trial and error. But mostly students do not have access to enough subjects to take a random sample. Even if we go to secondary sources, the data is seldom random, and the students do not get the opportunity to take the sample. It would be a pity not to use some interesting data, just because the collection method was dubious (or even realistic). At the same time we do not want students to think that seriously dodgy data has the same value as a carefully collected random sample.

Possible solutions

These are more suggestions than solutions, but the essence is to do the best you can and make sure the students learn to be critical of their own methods.

Teach the best way, pretend and look for potential problems.

Teach the ideal and also teach the reality. Teach about the different ways of taking random samples. Use my video if you like!

Get students to think about the pros and cons of each method, and where problems could arise. Also get them to think about the kinds of data they are using in their exercises, and what biases they may have.
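For anyone who wants students to try the different methods on a class list or a fictional database, the sketch below shows one way each could be coded in Python. The population of 300 students, with hypothetical `year` and `class` fields, is made up for the example, and the function names are illustrative rather than taken from any particular package.

```python
import random


def simple_random_sample(population, k, rng):
    """Every subset of size k is equally likely."""
    return rng.sample(population, k)


def systematic_sample(population, k, rng):
    """Pick a random start, then take every step-th member (needs k <= N)."""
    step = len(population) // k
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(k)]


def group_by(population, key):
    """Helper: split the population into groups using a key function."""
    groups = {}
    for item in population:
        groups.setdefault(key(item), []).append(item)
    return groups


def stratified_sample(population, key, k_per_stratum, rng):
    """Take a simple random sample from within every stratum."""
    sample = []
    for members in group_by(population, key).values():
        sample.extend(rng.sample(members, min(k_per_stratum, len(members))))
    return sample


def cluster_sample(population, key, n_clusters, rng):
    """Choose whole clusters at random and keep everyone in them."""
    clusters = group_by(population, key)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [item for c in chosen for item in clusters[c]]


rng = random.Random(0)
# Hypothetical school: 300 students, three year levels, ten classes of 30.
students = [{"id": i, "year": 9 + i % 3, "class": i // 30} for i in range(300)]

srs = simple_random_sample(students, 30, rng)
systematic = systematic_sample(students, 30, rng)
stratified = stratified_sample(students, lambda s: s["year"], 10, rng)
clustered = cluster_sample(students, lambda s: s["class"], 2, rng)
print(len(srs), len(systematic), len(stratified), len(clustered))
```

Students can then compare, for instance, how many different classes turn up in each sample, which leads naturally into the discussion of where each method might go wrong.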

We also need to teach that, used judiciously, a convenience sample can still be of value. For example I have collected data from students in my class about how far they live from university, and whether or not they have a car. This data is not a random sample of any population. However, it is still reasonable to suggest that it may represent all the students at the university – or maybe just the first year students. It possibly represents students in the years preceding and following my sample, unless something has happened to change the landscape. It has worth in terms of inference. Realistically, I am never going to take a truly random sample of all university students, so this may be the most suitable data I ever get. I have no doubt that it is better than no information.

Not all questions are of equal worth. Knowing whether students who own cars live further from university, in general, is interesting but not of great importance. Were I to be researching topics of great importance, such as safety features in roads or medicine, I would have a greater need for rigorous sampling.

So generally, I see no harm in pretending. I use the data collected from my class, and I say that we will pretend that it comes from a representative random sample. We talk about why it isn’t, but then we move on. It is still interesting data, it is real and it is there. When we write up analysis we include critical comments with provisos on how the sample may have possible bias.

What is important is for students to experience the excitement of discovering real effects (or lack thereof) in real data. What is important is for students to be critical of these discoveries, through understanding the limitations of the data collection process. Consequently I see no harm in using non-random, realistic sampled real data, with a healthy dose of scepticism.