# What is the statistical enquiry cycle and why is it a cycle? Is it really a cycle?

The New Zealand curriculum for Mathematics and statistics was recently held up as an example of good practice with regard to statistics. Yay us! In New Zealand the learning of statistics starts at the beginning of schooling and is part of the curriculum right through the school years. Statistics is developed as a discipline alongside mathematics, rather than as a subset of it. There are mathematics teachers who view this as an aberration, and believe that when this particular fad is over statistics will go back where it belongs, tucked quietly behind measurement, algebra and arithmetic. But the statisticians rejoice that the rich and exciting world of real data and detective work is being opened up to early learners. The outcome for mathematics and statistics remains to be seen.

A quick look over the Australian curriculum shows ostensibly a similar emphasis with regard to content at most levels. The big difference (at first perusal) is that the New Zealand curriculum has two strands of statistics – statistical investigation, and statistical literacy, whereas the Australian curriculum has the more mathematical approach of “Data representation and interpretation”. Both include probability as another strand.

## Data Detective Cycle

In the New Zealand curriculum, the statistical investigation strand at every level refers to the “Statistical enquiry cycle”, shown here, which is also known as the PPDAC cycle. This is a unifying theme and organising framework for teachers and learners.

This link takes you to a fuller explanation of the statistical enquiry cycle and its role at the different levels of the school curriculum. Note that the levels do not correspond to years. Click here to see the correspondence. The first five levels correspond to about 2 years each, whereas levels 6, 7 and 8 correspond to the final three years of high school. So a child working on level 3 is generally aged about 10 or 11.

As I provide resources to support teaching and learning within the NZ curriculum I have become more aware of this framework, and have some questions and suggestions. I have made a table from which I hope to develop another diagram that students at higher levels can engage with, particularly with regard to the reporting aspects. As this is a work in progress you will have to wait!

## Origins

Let’s look at the origins of the diagram and terminology. Maxine Pfannkuch (an educator) worked with Chris Wild (a statistician) to articulate what it is that statisticians do. They published their results in the International Statistical Review in 1999 and contributed the chapter “Towards an understanding of statistical thinking” in “The Challenge of Developing Statistical Literacy, Reasoning and Thinking”, edited by Dani Ben-Zvi and Joan Garfield. The statistical enquiry cycle has consequently been promulgated in the diagram and description referred to above. There is sound research behind this, and it makes good sense as a way of explaining what statisticians do.

## Diagrams

I love diagrams. Anyone who has viewed my videos will know this. I spend a great deal of mental energy (usually while running) trying to work out ways to convey ideas in a visual way that will help people to learn, understand and remember. I also do NOT believe in the fad of learning styles, but rather I believe that all learners will gain from different presentations of concepts. I also believe that it is a useful discipline for a teacher to create different ways of expressing concepts. I am rather fussy about diagrams, however, as our Honours students would attest. I have a particular problem with arrows which mean different things in different places. If an arrow denotes passage of time in one instance it should do so in all instances, or a different style of arrow should be employed.

## No way in or out

A problem I have with the PPDAC “Cycle” being a cycle is that it seems to imply that we can come in at any point and that there is no escape. If there is a logical starting point, and the link back to it is not one of process, then that should be indicated. Because the arrows are all the same style in the PPDAC diagram, it is also difficult to see a way out of the cycle. As a learner I would find it a little daunting to think that I could never escape! I also struggle to understand in what way a Conclusion leads to a Problem. Surely the whole point of the word “Conclusion” is that it concludes or ends something?

To me there are at least three linkages between the Problem and the Conclusion. First of all, while in the Problem stage, we need to think about what we want to be able to say in the future Conclusion stage. We may not know which way our conclusion will go, though we will probably have an opinion, or even a hope! (I am too post-modern in my thinking to believe in the objectivity of the researcher.) For instance we may want to be able to say – There is (or is not) evidence that women own more pairs of shoes than men. Another linkage is that when we write up our conclusion we must refer back to the original problem. And the third linkage comes from a comment Jean Thompson made on my blog about teaching time series without many computers. “Often the answer from a good statistical analysis is more questions”. One conclusion can lead to a new problem.

I found a similar diagram online which is more sequential, starting with the problem and working vertically through the steps, with a link at the end going back to the beginning. I like this, because it does give an idea of conclusion and moving on, rather than being caught in some endless cycle. The reality for students is that they will generally do some project, which will start with a problem and end with a conclusion. Then they will move on to an unrelated project. It has also been my experience as a practitioner.

In my experience the cyclical behaviour which this diagram portrays is generally more within the cycle than over the whole cycle. For instance one may be part way through the data collection and realise that it isn’t going to work, and go back to the “Plan” stage. Some of these extra loops are suggested in my table.
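To make the idea of loops within the cycle concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not an official description of PPDAC: the names `Stage`, `next_stages` and `is_valid_path` are made up for this example, and I have assumed that from any stage you may either take one step forward or go back to any earlier stage, and that a project starts at Problem and ends at Conclusion.

```python
from enum import Enum


class Stage(Enum):
    """The five PPDAC stages, in their usual order."""
    PROBLEM = "Problem"
    PLAN = "Plan"
    DATA = "Data"
    ANALYSIS = "Analysis"
    CONCLUSION = "Conclusion"


# Enum members iterate in definition order.
ORDER = list(Stage)


def next_stages(stage):
    """Stages reachable from `stage` (assumed rule, for illustration only):
    one step forward, or back to any earlier stage."""
    i = ORDER.index(stage)
    forward = ORDER[i + 1:i + 2]  # empty after Conclusion: nothing follows it
    backward = ORDER[:i]          # inner loops, e.g. Data back to Plan
    return forward + backward


def is_valid_path(path):
    """True if the project starts at Problem, ends at Conclusion,
    and every move between consecutive stages is an allowed one."""
    if not path or path[0] is not Stage.PROBLEM or path[-1] is not Stage.CONCLUSION:
        return False
    return all(b in next_stages(a) for a, b in zip(path, path[1:]))


# A project where data collection fails and we return to the Plan stage.
project = [
    Stage.PROBLEM, Stage.PLAN, Stage.DATA,
    Stage.PLAN, Stage.DATA, Stage.ANALYSIS, Stage.CONCLUSION,
]
print(is_valid_path(project))
```

In this sketch the way out is explicit: nothing follows Conclusion, so a project ends there, while the backward moves capture the smaller loops that happen along the way.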

## Reporting

For students at a higher level who are required to write reports, it is difficult to see how the report fits in with the cycle. The “Conclusion” step includes “communication”, which could imply a report. However reports often include most of the steps, particularly when their purpose is to satisfy an assessment requirement.

## Existing datasets

It is also difficult to apply the cycle in a non-cynical way to work with existing datasets. Often, in the interests of time and quality control, students are given a dataset. In reality they start, not at the Problem step, but somewhere between the Data step and the Analysis step. In their assessments they are required to read around the topic and use their imaginations to come up with the problem, look at how the data was collected, and move on from there. This is not always the case, but it is for NCEA level 3 Bivariate Investigation, Time Series analysis and Formal Inference areas (called ‘standards’). The only area where they really do plan and collect the data is in the Experimental Design standard. Might it not be helpful to provide an adapted plan that takes into account these exigencies? Let us be explicit about it rather than coyly pretending that the data wasn’t driving everything.

In general I like the concept of the statistical enquiry cycle, and I am happy that it is providing a unifying theme to the curriculum. However, particularly at higher levels, I think it needs a bit of tweaking, taking into account the experience of teachers and learners. If it is to hold such an important place in a curriculum that is leading the world, it deserves on-going attention.

## Disclaimer

This is a blog and not an academic journal. The ideas I have contemplated need a lot more thought and background reading, but I do not have the time or the university salary to support such a luxury right now. Maybe someone else does!

I’m inclined to agree with you here. There’s good stuff in the cycle – I especially applaud planning before data acquisition and analysis – but it clearly doesn’t need to be a cycle. For a start, one doesn’t come back to the _same_ problem, as an over-literal interpretation would suggest. Nor do conclusions necessarily generate a new problem; in fact ideally they don’t.

Bottom line: I’d see each case as a process with a reasonably well defined endpoint, not a cycle. And indeed, that’s the way we manage routine statistical analysis in my stats group; one ‘call’, one problem, and one report, reviewed, checked and signed off. A new problem is a new call with its own report, never a continuation.

This bears a strong resemblance to something called a Deming cycle, commonly used in describing what statistical quality improvement does. Deming called it a Shewhart cycle, after his colleague and the originator of some of those ideas.

In that context, the cycle makes more sense. It makes less sense in the context of scientific-statistical projects that start with a question, proceed through design and analysis, and end with conclusions and reporting. Such projects occur in quality improvement, but quality improvement projects also end in changes to a system of production, which then requires ongoing monitoring and further change, all intended as improvement.

The Shewhart or Deming cycle is a way of describing the ongoing efforts to analyze a system, change it, then study the results of the changes. It differs from the chart you show in not having a Conclusion step. Conclusions exist, but they are built into a system of production and the results of the changes get studied and evaluated again.

It’s a cycle because there is no sharp end to the whole thing.
