Why people hate statistics

This summer/Christmas break it has been my pleasure to help a young woman who is struggling with statistics, and it has prompted me to ask people who teach postgraduate statistical methods – WTF are you doing?

Louise (name changed) is a bright, hard-working young woman, who has finished an undergraduate degree at a prestigious university and is now doing a Masters degree at a different prestigious university, which is a long way from where I live and will remain nameless. I have been working through her lecture slides, past and future, and attempting to develop in her some confidence that she will survive the remainder of the course, and that statistics is in fact fathomable.

Incomprehensible courses alienating research students

After each session with Louise I have come away shaking my head and wondering what this lecturer is up to. I wonder if he/she really understands statistics or is just passing on their own confusion. And the very sad thing is that I KNOW that there are hundreds of lecturers in hundreds of similar courses around the world teaching in much the same way and alienating thousands of students every year.

And they need to stop.

Here is the approach: You have approximately eight weeks, made up of four-hour sessions, in which to teach your masters students everything they could possibly need to know about statistics. So you tell them everything! You use technical terms with little explanation, and you give no indication of what is important and what is background. You dive right in with no clear purpose, and you expect them to keep up.

Choosing your level

Frequently Louise would ask me to explain something and I would pause to think. I was trying to work out how deep to go. It is like when a child asks where babies come from. They may want the full details, but they may not, and you need to decide what level of answer is most appropriate. Anyone who has seen our popular YouTube videos will be aware that I encourage conceptual understanding at best, and the equivalent of a statistics driver’s licence at worst. When you have eight weeks to learn everything there is to know about statistics, up to and including multiple regression, logistic regression, GLM, factor analysis, non-parametric methods and more, I believe the most you can hope for is to be able to get the computer to run the test, and then draw intelligent conclusions from the output.

There was nothing in the course about data collection, data cleaning, the concept of inference or the relationship between the model and reality. My experience is that data cleaning is one of the most challenging parts of analysis, especially for novice researchers.

Use learning objectives

And maybe one of the worst problems with Louise’s course was that there were no specific learning objectives. One of my most popular posts is on the need for learning objectives. Now I am not proposing that we slavishly tell students in each class what it is they are to learn, as that can be tedious and remove the fun from more discovery style learning. What I am saying is that it is only fair to tell the students what they are supposed to be learning. This helps them to know what in the lecture is important, and what is background. They need to know whether they need to have a passing understanding of a test, or if they need to be able to run one, or if they need to know the underlying mathematics.

Take for example, the t-test. There are many ways that the t-statistic can be used, so simply referring to a test as a t-test is misleading before you even start. And starting your teaching with the statistic is not helpful. We need to start with the need! I would call it a test for the difference of two means from two groups. And I would just talk about the t statistic in passing. I would give examples of output from various scenarios, some of which reject the null, some of which don’t and maybe even one that has a p-value of 0.049 so we can talk about that. In each case we would look at how the context affects the implications of the test result. In my learning objectives I would say: Students will be able to interpret the output of a test for the difference of two means, putting the result in context. And possibly, Students will be able to identify ways in which a test for the difference of two means violates the assumptions of a t-test. Now that wasn’t hard was it?

Like driving a car

Louise likes to understand where things come from, so we did go through an overview of how various distributions have been found to model different aspects of the world well – starting with the normal distribution, and with a quick jaunt into the Central Limit Theorem. I used my Dragonistics data cards, which were invented for teaching primary school, but actually work surprisingly well at all levels! I can’t claim that Louise understands the use of the t distribution, but I hope she now believes in it. I gave her the analogy of learning to drive – that we don’t need to know what is happening under the bonnet to be a safe driver. In fact safe driving depends more on paying attention to the road conditions and human behaviour.

Assumptions

Louise tells me that her lecturer emphasises assumptions – that the students need to examine them all, every time they look at or perform a statistical test. Now I have no problems with this later on, but students need to have some idea of where they are going and why, before being told what luggage they can and can’t take. And my experience is that assumptions are always violated. Always. As George Box put it – “All models are wrong, but some are useful.”

It did not help that the lecturer seemed a little confused about the assumption of normality. I am not one to point the finger, as this is a tricky assumption, as the Andy Field textbook pointed out. For example, we do not require the independent variables in a multiple regression to be normally distributed as the lecturer specified. This is not even possible if we are including dummy variables. What we do need to watch out for is that the residuals are approximately modelled by a normal distribution, and if not, that we do something about it.
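For readers who like to see this in code, here is a minimal sketch of the point about residuals. Everything in it is made up for illustration: the dummy predictor, the coefficients (10 and 5) and the noise level are my assumptions, and the "share within one standard deviation" check is only a rough stand-in for a proper residual plot. It shows a predictor that takes just two values, so it cannot be normally distributed, while the residuals still behave roughly like normal data.

```python
import random
from statistics import mean, pstdev

def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def share_within_one_sd(values):
    """Fraction of values within one SD of their mean (about 0.68 for normal data)."""
    m, s = mean(values), pstdev(values)
    return sum(abs(v - m) <= s for v in values) / len(values)

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical data: a 0/1 dummy predictor and an outcome with normal noise.
dummy = [rng.randint(0, 1) for _ in range(500)]
outcome = [10 + 5 * d + rng.gauss(0, 2) for d in dummy]

slope, intercept = fit_line(dummy, outcome)
residuals = [y - (intercept + slope * d) for d, y in zip(dummy, outcome)]

print("distinct predictor values:", sorted(set(dummy)))
print("estimated slope:", round(slope, 2))
print("share of residuals within 1 SD:", round(share_within_one_sd(residuals), 2))
```

In practice you would look at a histogram or normal probability plot of the residuals rather than a single number, but the idea is the same: the check belongs on the residuals, not on the predictors.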

You may have gathered that my approach to statistics is practical rather than idealistic. Why get all hot and bothered about whether you should do a parametric or non-parametric test, when the computer package does both with ease, and you just need to check if there is any difference in the result. (I can hear some purists hyperventilating at this point!) My experience is that the results seldom differ.
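Here is a rough sketch of what "run both and compare" can look like, using only the Python standard library. The two groups are simulated, and both p-values use a normal approximation (reasonable for groups of this size, but not what a statistics package would report for small samples), so treat the functions as illustrations rather than replacements for a real package. The function names are my own.

```python
import random
from statistics import NormalDist, mean, variance

def welch_p_value(a, b):
    """Two-sided p-value for a difference of two means.

    Uses the Welch statistic with a normal approximation instead of the
    t distribution, which is an assumption suited to larger samples.
    """
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def mann_whitney_p_value(a, b):
    """Two-sided p-value for the Mann-Whitney U test.

    Normal approximation, no tie correction.
    """
    # U counts the pairs where the value from a beats the value from b;
    # ties count as half.
    u = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    n1, n2 = len(a), len(b)
    mu = n1 * n2 / 2
    sigma = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (u - mu) / sigma
    return 2 * (1 - NormalDist().cdf(abs(z)))

rng = random.Random(1)
group_a = [rng.gauss(50, 10) for _ in range(40)]  # simulated scores
group_b = [rng.gauss(56, 10) for _ in range(40)]

print("parametric p-value:    ", round(welch_p_value(group_a, group_b), 4))
print("non-parametric p-value:", round(mann_whitney_p_value(group_a, group_b), 4))
```

If the two p-values lead to the same conclusion, there is little to agonise over. If they disagree, that is the signal to look harder at the data.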

What post-graduate statistical methods courses should focus on

Instructors need to concentrate on the big ideas of statistics – what is inference, why we need data, how a sample is collected matters, and the relationship between a model and the reality it is modelling. I would include the concept of correlation, and its problematic link to causation. I would talk about the difference between statistical significance and usefulness, and evidence and strength of a relationship. And I would teach students how to find the right fishing lessons! If a student is critiquing a paper that uses logistic regression, that is the time they need to read up enough about logistic regression to be able to understand what they are reading. They cannot possibly learn a useful amount about all the tests or methods that they may encounter one day.

If research students are going to be doing their own research, they need more than a one semester fly-by of techniques, and would be best to get advice from a statistician BEFORE they collect the data.

Final word

So here is my take-home message:

Stop making graduate statistical methods courses so outrageously difficult by cramming them full of advanced techniques and concepts. Instead help students to understand what statistics is about, and how powerful and wonderful it can be to find out more about the world through data.

Your word

Am I right or is my preaching of the devil? Please add your comments below.

Data for teaching – real, fake, fictional

There is a push for teachers and students to use real data in learning statistics. In this post I am going to address the benefits and drawbacks of different sources of real data, and make a case for the use of good fictional data as part of a statistical programme.

Here is a video introducing our fictional data set of 180 or 240 dragons, so you know what I am referring to.

Real collected, real database, trivial, fictional

There are two main types of real data. There is the real data that students themselves collect and there is real data in a dataset, collected by someone else, and available in its entirety. There are also two main types of unreal data. The first is trivial and lacking in context and useful only for teaching mathematical manipulation. The second is what I call fictional data, which is usually based on real-life data, but with some extra advantages, so long as it is skilfully generated. Poorly generated fictional data, as often found in case studies, is very bad for teaching.

Focus

When deciding what data to use for teaching statistics, it matters what it is that you are trying to teach. If you are simply teaching how to add up 8 numbers and divide the result by 8, then you are not actually doing statistics, and trivial fake data will suffice. Statistics only exists when there is a context. If you want to teach about the statistical enquiry process, then having the students genuinely involved at each stage of the process is a good idea. If you are particularly wanting to teach about fitting a regression line, you generally want to have multiple examples for students to use. And it would be helpful for there to be at least one linear relationship.

I read a very interesting article in “Teaching Children Mathematics” entitled, “Practical Problems: Using Literature to Teach Statistics”. The authors, Hourigan and Leavy, used a children’s book to generate the data on the number of times different characters appeared. But what I liked most was that they addressed the need for a “driving question”. In this case the question was provided by a pre-school teacher who could only afford to buy one puppet for the book, and wanted to know which character appears the most in the story. The children practised collecting data as the story was read aloud. They collected their own data to analyse.

Let’s have a look at the different pros and cons of student-collected data, provided real data, and high-quality fictional data.

Collecting data

When we want students to experience the process of collecting real data, they need to collect real data. However, real-time data collection is time-consuming, and probably not necessary every year. Student data collection can be simulated by a program such as The Islands, which I wrote about previously. Data students collect themselves is much more likely to have errors in it, or be “dirty” (which is a good thing). When students are only given clean datasets, such as those usually provided with textbooks, they do not learn the skills of deciding what to do with an errant data point. Fictional databases can also have dirty data generated into them. The fictional inhabitants of The Islands sometimes lie, and often refuse to give consent for data collection on them.

Motivation

One of the species of dragons included in our database

I have heard that after a few years of school, graphs about cereal preference, number of siblings and type of pet get a little old. These topics, relating to the students, are motivating at first, but often there is no purpose to the investigation other than to get data for a graph.  Students need to move beyond their own experience and are keen to try something new. Data provided in a database can be motivating, if carefully chosen. There are opportunities to use databases that encourage awareness of social justice, the environment and politics. Fictional data must be motivating or there is no point! We chose dragons as a topic for our first set of fictional data, as dragons are interesting to boys and girls of most ages.

A meaningful question

Here I refer again to that excellent article that talks about a driving question. There needs to be a reason for analysing the data. Maybe there is concern about food provided at the tuck shop, with healthy alternatives. Or can the question be tied into another area of the curriculum, such as which type of bean plant grows faster? Or can we increase the germination rate of seeds? The Census@school data has the potential for driving questions, but they probably need to be helped along. For existing datasets the driving question used by students might not be the same as the one (if any) driving the original collection of data. Sometimes that is because the original purpose is not ‘motivating’ for the students or not at an appropriate level. If you can’t find or make up a motivating meaningful question, the database is not appropriate. For our fictional dragon data, we have developed two scenarios – vaccinating for Pacific Draconian flu, and building shelters to make up for the deforestation of the island. With the vaccination scenario, we need to know about behaviour and size. For the shelter scenario we need to make decisions based on size, strength, behaviour and breath type. There is potential for a number of other scenarios that will also create driving questions.

Getting enough data

It can be difficult to get enough data for effects to show up. When students are limited to their class or family, this limits the number of observations. Only some databases have enough observations in them. There is no such problem with fictional databases, as you can just generate as much data as you need! There are special issues with regard to teaching about sampling, where you would want a large database with constrained access, like the Islands data, or the use of cards.

Variables

A problem with the data students collect is that it tends to be categorical, which limits the types of analysis that can be used. In databases, it can also be difficult to find measurement level data. In our fictional dragon database, we have height, strength and age, which all take numerical values. There are also four categorical variables. The Islands database has a large number of variables, both categorical and numerical.

Interesting Effects

Though it is good for students to understand that quite often there is no interesting effect, we would like students to have the satisfaction of finding interesting effects in the data, especially at the start. Interesting effects can be particularly exciting if the data is real, and they can apply their findings to the real world context. Student-collected-data is risky in terms of finding any noticeable relationships. It can be disappointing to do a long and involved study and find no effects. Databases from known studies can provide good effects, but unfortunately the variables with no effect tend to be left out of the databases, giving a false sense that there will always be effects. When we generate our fictional data, we make sure that there are the relationships we would like there, with enough interaction and noise. This is a highly skilled process, honed by decades of making up data for student assessment at university. (Guilty admission)
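To give a flavour of what generating fictional data involves, here is a toy sketch. It is not how our dragon data was actually produced: the breath types, the coefficients and the noise levels are all invented for this example, and it builds in only simple additive effects, with none of the interactions a carefully crafted dataset would have. The idea is just that you decide on the relationships first and then add enough noise that they are not embarrassingly obvious.

```python
import random
from statistics import mean

def make_dragons(n, seed=0):
    """Generate n fictional dragons with relationships built in (all values hypothetical)."""
    rng = random.Random(seed)
    dragons = []
    for _ in range(n):
        breath = rng.choice(["fire", "ice", "smoke"])  # invented categories
        age = rng.randint(1, 60)
        # Height grows with age, plus noise.
        height = 100 + 2.0 * age + rng.gauss(0, 15)
        # Strength depends on height, with a bonus for fire breathers, plus noise.
        bonus = 2.0 if breath == "fire" else 0.0
        strength = 0.05 * height + bonus + rng.gauss(0, 1.5)
        dragons.append(
            {"breath": breath, "age": age, "height": height, "strength": strength}
        )
    return dragons

def correlation(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

dragons = make_dragons(240)
r = correlation(
    [d["height"] for d in dragons],
    [d["strength"] for d in dragons],
)
print("dragons generated:", len(dragons))
print("height-strength correlation:", round(r, 2))
```

Changing the noise levels changes how hard the students have to work to find the effect, which is exactly the control you do not have with real data.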

Ethics

There are ethical issues to be addressed in the collection of real data from people the students know. Informed consent should be granted, and there needs to be thorough vetting. Young students (and not so young) can be damagingly direct in their questions. You may need to explain that it can be upsetting for people to be asked if they have been beaten or bullied. When using fictional data, that may appear real, such as the Islands data, it is important for students to be aware that the data is not real, even though it is based on real effects. This was one of the reasons we chose to build our first database on dragons, as we hope that will remove any concerns about whether the data is real or not!

The following table summarises the post.

| | Real data collected by the students | Real existing database | Fictional data (The Islands, Kiwi Kapers, Dragons, Desserts) |
|---|---|---|---|
| Data collection | Real experience | Nil | Sometimes |
| Dirty data | Always | Seldom | Can be controlled |
| Motivating | Can be | Can be | Must be! |
| Enough data | Time consuming, difficult | Hard to find | Always |
| Meaningful question | Sometimes. Can be trivial | Can be difficult | Part of the fictional scenario |
| Variables | Tend towards nominal | Often too few variables | Generate as needed |
| Ethical issues | Often | Usually fine | Need to manage reality |
| Effects | Unpredictable | Can be obvious or trivial, or difficult | Can be managed |

The Myth of Random Sampling

I feel a slight quiver of trepidation as I begin this post – a little like the boy who pointed out that the emperor has no clothes.

Random sampling is a myth. Practical researchers know this and deal with it. Theoretical statisticians live in a theoretical world where random sampling is possible and ubiquitous – which is just as well really. But teachers of statistics live in a strange half-real-half-theoretical world, where no one likes to point out that real-life samples are seldom random.

The problem in general

In order for most inferential statistical conclusions to be valid, the sample we are using must obey certain rules. In particular, each member of the population must have an equal probability of being chosen. In this way we reduce the opportunity for systematic error, or bias. When a truly random sample is taken, it is almost miraculous how well we can make conclusions about the source population, with even a modest sample of a thousand. On a side note, if the general population understood this, and the opportunity for bias and corruption were eliminated, general elections and referenda could be done at much less cost, through taking a good random sample.
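The "almost miraculous" claim is easy to try out. The sketch below invents a population of 200,000 voters in which 52% favour a proposal (both numbers are assumptions for illustration), takes a simple random sample of 1,000, and compares the estimate with the truth. It also computes the usual approximate 95% margin of error for a proportion, 1.96 × √(p(1 − p)/n), which comes to about three percentage points for a sample of a thousand.

```python
import random

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion."""
    return z * (p * (1 - p) / n) ** 0.5

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1 = in favour, 0 = against (52% in favour).
population = [1] * 104_000 + [0] * 96_000

sample = rng.sample(population, 1000)  # simple random sample, no replacement
estimate = sum(sample) / len(sample)
true_share = sum(population) / len(population)

print("true share:     ", true_share)
print("sample estimate:", estimate)
print("margin of error:", round(margin_of_error(estimate, len(sample)), 3))
```

Note that the margin of error depends on the size of the sample, not on the size of the population, which is why a thousand people can stand in for millions, provided the sample really is random.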

However! It is actually quite difficult to take a random sample of people. Random sampling is doable in biology, I suspect, where seeds or plots of land can be chosen at random. It is also fairly possible in manufacturing processes. Medical research relies on the use of a random sample, though it is seldom of the total population. Really it is more about randomisation, which can be used to support causal claims.

But the area of most interest to most people is people. We actually want to know about how people function, what they think, their economic activity, sport and many other areas. People find people interesting. To get a really good sample of people takes a lot of time and money, and is outside the reach of many researchers. In my own PhD research I approximated a random sample by taking a stratified, cluster semi-random almost convenience sample. I chose representative schools of different types throughout three diverse regions in New Zealand. At each school I asked all the students in a class at each of three year levels. The classes were meant to be randomly selected, but in fact were sometimes just the class that happened to have a teacher away, as my questionnaire was seen as a good way to keep them quiet. Was my data of any worth? I believe so, of course. Was it random? Nope.

Problems people have in getting a good sample include cost, time and also response rate. Much of the data that is cited in papers is far from random.

The problem in teaching

The wonderful thing about teaching statistics is that we can actually collect real data and do analysis on it, and get a feel for the detective nature of the discipline. The problem with sampling is that we seldom have access to truly random data. By random I am not meaning just simple random sampling, the least simple method! Even cluster, systematic and stratified sampling can be a challenge in a classroom setting. And sometimes if we think too hard we realise that what we have is actually a population, and not a sample at all.

It is a great experience for students to collect their own data. They can write a questionnaire and find out all sorts of interesting things, through their own trial and error. But mostly students do not have access to enough subjects to take a random sample. Even if we go to secondary sources, the data is seldom random, and the students do not get the opportunity to take the sample. It would be a pity not to use some interesting data, just because the collection method was dubious (or even realistic). At the same time we do not want students to think that seriously dodgy data has the same value as a carefully collected random sample.

Possible solutions

These are more suggestions than solutions, but the essence is to do the best you can and make sure the students learn to be critical of their own methods.

Teach the best way, pretend and look for potential problems.

Teach the ideal and also teach the reality. Teach about the different ways of taking random samples. Use my video if you like!

Get students to think about the pros and cons of each method, and where problems could arise. Also get them to think about the kinds of data they are using in their exercises, and what biases they may have.
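As a starting point for that discussion, here is a small sketch of four common sampling methods applied to a made-up school roll. The roll (three year levels, four classes of 25 in each) and all the function names are my own inventions for illustration. Real sampling frames are messier, which is rather the point of this post.

```python
import random

def simple_random_sample(units, n, rng):
    """Every unit has the same chance of selection."""
    return rng.sample(units, n)

def systematic_sample(units, n, rng):
    """Random start, then every k-th unit down the list."""
    step = len(units) // n
    start = rng.randrange(step)
    return units[start::step][:n]

def stratified_sample(units, stratum_of, n_per_stratum, rng):
    """Simple random sample of fixed size within each stratum."""
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(units, cluster_of, n_clusters, rng):
    """Pick whole clusters at random and take everyone in them."""
    clusters = {}
    for unit in units:
        clusters.setdefault(cluster_of(unit), []).append(unit)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

# Hypothetical school roll: 3 year levels x 4 classes x 25 students.
students = []
for year in (9, 10, 11):
    for room in "ABCD":
        for _ in range(25):
            students.append(
                {"id": len(students), "year": year, "class": f"{year}{room}"}
            )

rng = random.Random(11)
print(len(simple_random_sample(students, 30, rng)), "by simple random sampling")
print(len(systematic_sample(students, 30, rng)), "by systematic sampling")
print(len(stratified_sample(students, lambda s: s["year"], 10, rng)), "by stratified sampling")
print(len(cluster_sample(students, lambda s: s["class"], 2, rng)), "by cluster sampling")
```

Students can then ask of each method: who could never end up in this sample, and does that matter for the question being asked?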

We also need to teach that, used judiciously, a convenience sample can still be of value. For example I have collected data from students in my class about how far they live from university, and whether or not they have a car. This data is not a random sample of any population. However, it is still reasonable to suggest that it may represent all the students at the university – or maybe just the first year students. It possibly represents students in the years preceding and following my sample, unless something has happened to change the landscape. It has worth in terms of inference. Realistically, I am never going to take a truly random sample of all university students, so this may be the most suitable data I ever get. I have no doubt that it is better than no information.

Not all questions are of equal worth. Knowing whether students who own cars live further from university, in general, is interesting but not of great importance. Were I to be researching topics of great importance, such as safety features in roads or medicine, I would have a greater need for rigorous sampling.

So generally, I see no harm in pretending. I use the data collected from my class, and I say that we will pretend that it comes from a representative random sample. We talk about why it isn’t, but then we move on. It is still interesting data, it is real and it is there. When we write up analysis we include critical comments with provisos on how the sample may have possible bias.

What is important is for students to experience the excitement of discovering real effects (or lack thereof) in real data. What is important is for students to be critical of these discoveries, through understanding the limitations of the data collection process. Consequently I see no harm in using non-random, realistic sampled real data, with a healthy dose of scepticism.

Those who can, teach statistics

The phrase I despise more than any in popular use (and believe me there are many contenders) is “Those who can, do, and those who can’t, teach.” I like many of the sayings of George Bernard Shaw, but this one is dismissive, and ignorant and born of jealousy. To me, the ability to teach something is a step higher than being able to do it. The PhD, the highest qualification in academia, is a doctorate. The word “doctor” comes from the Latin word for teacher.

Teaching is a noble profession, on which all other noble professions rest. Teachers are generally motivated by altruism, and often go well beyond the requirements of their job-description to help students. Teachers are derided for their lack of importance, and the easiness of their job. Yet at the same time teachers are expected to undo the ills of society. Everyone “knows” what teachers should do better. Teachers are judged on their output, as if they were the only factor in the mix. Yet how many people really believe their success or failure is due only to the efforts of their teacher?

For some people, teaching comes naturally. But even then, there is the need for pedagogical content knowledge. Teaching is not a generic skill that transfers seamlessly between disciplines. You must be a thinker to be a good teacher. It is not enough to perpetuate the methods you were taught with. Reflection is a necessary part of developing as a teacher. I wrote in an earlier post, “You’re teaching it wrong”, about the process of reflection. Teachers need to know their material, and keep up-to-date with ways of teaching it. They need to be aware of ways that students will have difficulties. Teachers, by sharing ideas and research, can be part of a communal endeavour to increase both content knowledge and pedagogical content knowledge.

There is a difference between being an explainer and being a teacher. Sal Khan, maker of the Khan Academy videos, is a very good explainer. Consequently many students who view the videos are happy that elements of maths and physics that they couldn’t do, have been explained in such a way that they can solve homework problems. This is great. Explaining is an important element in teaching. My own videos aim to explain in such a way that students make sense of difficult concepts, though some videos also illustrate procedure.

Teaching is much more than explaining. Teaching includes awakening a desire to learn and providing the experiences that will help a student to learn.  In these days of ever-expanding knowledge, a content-driven approach to learning and teaching will not serve our citizens well in the long run. Students need to be empowered to seek learning, to criticize, to integrate their knowledge with their life experiences. Learning should be a transformative experience. For this to take place, the teachers need to employ a variety of learner-focussed approaches, as well as explaining.

It cracks me up, the way sugary cereals are advertised as “part of a healthy breakfast”. It isn’t exactly lying, but the healthy breakfast would do pretty well without the sugar-filled cereal. Explanations really are part of a good learning experience, but need to be complemented by discussion, participation, practice and critique.  Explanations are like porridge – healthy, but not a complete breakfast on their own.

Why statistics is so hard to teach

“I’m taking statistics in college next year, and I can’t wait!” said nobody ever!

Not many people actually want to study statistics. Fortunately many people have no choice but to study statistics, as they need it. How much nicer it would be to think that people were studying your subject because they wanted to, rather than because it is necessary for psychology/medicine/biology etc.

In New Zealand, with the changed school curriculum that gives greater focus to statistics, there is a possibility that one day students will be excited to study stats. I am impressed at the way so many teachers have embraced the changed curriculum, despite limited resources, and late changes to assessment specifications. In a few years as teachers become more familiar with and start to specialise in statistics, the change will really take hold, and the rest of the world will watch in awe.

In the meantime, though, let us look at why statistics is difficult to teach.

  1. Students generally take statistics out of necessity.
  2. Statistics is a mixture of quantitative and communication skills.
  3. It is not clear which are right and wrong answers.
  4. Statistical terminology is both vague and specific.
  5. It is difficult to get good resources, using real data in meaningful contexts.
  6. One of the basic procedures, hypothesis testing, is counter-intuitive.
  7. Because the teaching of statistics is comparatively recent, there is little developed pedagogical content knowledge. (Though this is growing)
  8. Technology is forever advancing, requiring regular updating of materials and teaching approaches.

On the other hand, statistics is also a fantastic subject to teach.

  1. Statistics is immediately applicable to life.
  2. It links in with interesting and diverse contexts, including subjects students themselves take.
  3. Studying statistics enables class discussion and debate.
  4. Statistics is necessary and does good.
  5. The study of data and chance can change the way people see the world.
  6. Technological advances have put the power for real statistical analysis into the hands of students.
  7. Because the teaching of statistics is new, individuals can make a difference in the way statistics is viewed and taught.

I love to teach. These days many of my students are scattered over the world, watching my videos (for free) on YouTube. It warms my heart when they thank me for making something clear, that had been confusing. I realise that my efforts are small compared to what their teacher is doing, but it is great to be a part of it.

How to study statistics (Part 1)

To students of statistics

Most of my posts are directed at teachers and how to teach statistics. The blog this week and next is devoted to students. I present principles that will help you to learn statistics. I’m turning them into a poster, which I will make available for you to print later. I’d love to hear from other teachers as I add to my list of principles.

1. Statistics is learned by doing

One of the best predictors of success in any subject is how much time you spend on it. If you want to learn statistics, you need to put in time. It is good to read the notes and the textbook, and to look up things on the internet and even to watch YouTube videos if they are good ones. But the most important way to learn statistics is by doing. You need to practise the skills that are needed by a statistician, which include logical thinking, interpretation, judgment and writing. Your teacher should provide you with worthwhile practice activities, and helpful timely feedback. Good textbooks have good practice exercises. On-line materials have many practice exercises.

Given a choice, do the exercises that have answers available. It is very important that you check what you are doing, as it is detrimental to practise something in the wrong way. Or if you are using an on-line resource, make sure you check your answers as you go, so that you gain from the feedback and avoid developing bad habits.

So the first principle should really be “statistics is learned by doing correctly”.

2. Understanding comes with application, not before.

Do not wait until you understand what you are doing before you get started. The understanding comes as you do the work. When we learn to speak, we do not wait until we understand grammatical structure before saying anything. We use what we have to speak and to listen, and as we do so we gain an understanding of how language works.  I have found that students who spent a lot of time working through the process of calculating conditional probabilities for screening tests grew to understand the “why” as well as the “how” of the process. Repeated application of using Excel to fit a line to bivariate data and explaining what it meant, enabled students to understand and internalise what a line means. As I have taught statistics for two decades, my own understanding has continued to grow.
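For anyone who has not met the screening test calculation, here is a small sketch of it. The prevalence, sensitivity and specificity are invented round numbers, not figures from any real test. The function works out the probability that a person who tests positive actually has the condition, and the loop underneath does the same thing by counting people, which is the version students usually find easier to follow.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(condition | positive test), by Bayes' rule."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)

# Hypothetical screening test: 1% prevalence, 90% sensitivity, 95% specificity.
ppv = positive_predictive_value(0.01, 0.90, 0.95)
print("P(condition | positive):", round(ppv, 3))

# The same calculation as a count of 10,000 imaginary people.
people = 10_000
with_condition = people * 0.01                    # 100 people
true_pos = with_condition * 0.90                  # 90 test positive
false_pos = (people - with_condition) * 0.05      # 495 test positive anyway
print("positives who have the condition:",
      round(true_pos), "out of", round(true_pos + false_pos))
```

With these made-up numbers, most positive results are false alarms, which is the surprise that repeated practice helps students come to terms with.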

There is a proviso. You need to think about what you are doing, and you need to do worthwhile exercises. For example, mechanically calculating the standard deviation of a set of numbers devoid of context will not help you understand standard deviation. Looking at graphs and trying to guess what the standard deviation is, would be a better exercise. Then applying the value to the context is better still.

Applying statistical principles to a wide variety of contexts helps us to discern what is specific to a problem and what is general for all problems. This brings us to the next principle.

3. Spend time exploring the context.

In a statistical analysis, context is vital, and often very interesting. You need to understand the problem that gave rise to the investigation and collection of the data. The context is what makes each statistical investigation different. Statisticians often work alongside other researchers in areas such as medicine, psychology, biology and geology, who provide the contextual background to the problem. This provides a wonderful opportunity for the statistician to learn about a whole range of different subjects. The interplay between the data and context means that every investigation is different.

In a classroom setting you will not have the subject expert available, but you do need to understand the story behind the data. These days, finding out is possible with a click of a Google or Wikipedia button. Knowing the background to the data helps you to make more sensible judgments – and it makes it more interesting.

4. Statistics is different from mathematics

In mathematics, particularly pure mathematics, context is stripped away in order to reveal the inner pure truth of numbers and logic. There are applied areas involving mathematics, which are more like statistics, such as operations research and engineering. At school level, one of the things that characterises the study of maths is right and wrong answers, with a minimum of ambiguity. That is what I loved about mathematics – being able to apply an algorithm and get a correct answer. In statistics, however, things are seldom black-and-white. In statistics you will need to interpret data from the perspective of the real world, and often the answer is not clear. Some people find the lack of certainty in statistics disturbing. There is considerable room for discussion in statistics. Some aspects of statistics are fuzzy, such as what to do with messy data, or which is the “best” model to fit a time series. There is a greater need for the ability to write in statistics, which makes it more challenging for students for whom English is not their native language.

5. Technology is essential

With computers and calculators, all sorts of activities are available to help learn statistics. Graphs and graphics enable exploration that was not possible when graphs had to be drawn by hand. You can have a multivariate data set and explore all the possible relationships with a few clicks. You should always look at the data in a graphical form before setting out to analyse.

Sometimes I would set optional exercises for students to explore the relationship between data, graphs and summary measures. Very few students did so, but when I led them through the same examples one at a time I could see the lights go on. When you are given opportunities to use computing power to explore and learn – do it!

But wait…there’s more

Here we have the first five principles for students learning statistics. Watch this space next week for some more. And do add some in the comments and I will try to incorporate your ideas as well.

Context – if it isn’t fun…

The role of context in statistical analysis

The wonderful advantage of teaching statistics is the real-life context within which any application must exist. This can also be one of the difficulties. Statistics without context is merely the mathematics of statistics, and is sterile and theoretical. The teaching of statistics requires real data. And real data often comes with a fairly solid back-story.

One of the interesting aspects for practicing statisticians is that they can find out about a wide range of applications by working in partnership with specialists. In my statistical and operations research advising I have learned about a range of subjects, including the treatment of hand injuries, children’s developmental understanding of probability, the bed occupancy in public hospitals, the educational needs of blind students, growth rates of vegetables, texted comments on service at supermarkets, killing methods of chickens, rogaine route choice, co-ordinating scientific expeditions to Antarctica and the cost of care for neonatals in intensive care. I found most of these really interesting and was keen to work with the experts on these projects. Statisticians tend to work in teams with specialists in related disciplines.

Learning a context can take time

When one is part of a long-term project, time spent learning the intricacies of the context is well spent. Without that, the meaning from the data can be lost. However, it is difficult to replicate this in the teaching of statistics, particularly in a general high school or service course. The amount of time required to become familiar with the context takes away from the time spent learning statistics. Too much time spent on one specific project or area of interest can mean that the students are unable to generalise. You need several different examples in order to know what is specific to the context and what is general to all or most contexts.

One approach is to try to have contexts with which students are already familiar. This can be enabled by collecting the data from the students themselves. The Census at School project provides international data for students to use in just this way. This is ideal, in that the context is familiar, and yet the data is “dirty” enough to provide challenges and judgment calls.

Some teachers find that this is too low-level and would prefer to use biological data, or dietary or sports data from other sources. I have some reservations about this. In New Zealand the new statistics curriculum is in its final year of introduction, and understandably there are some bedding-in issues. One I perceive is the relative importance of the context in the students’ reports. As these reports have high-stakes grades attached to them, this is an issue. I will use as an example the time series “standard”. The assessment specification states, among other things, “Using the statistical enquiry cycle to investigate time series data involves: using existing data sets, selecting a variable to investigate, selecting and using appropriate display(s), identifying features in the data and relating this to the context, finding an appropriate model, using the model to make a forecast, communicating findings in a conclusion.”

The full “standard” is given here: Investigate Time Series Data. This would involve about five weeks of teaching and assessment, in parallel with four other subjects. (The final 3 years of schooling in NZ are assessed through the National Certificate of Educational Achievement (NCEA). Each year students usually take five subject areas, each of which consists of about six “achievement standards” worth between 3 and 6 credits. There is a mixture of internally and externally assessed standards.)

In this specification I see that there is a requirement for the model to be related to the context. This is a great opportunity for teachers to show how models are useful, and their limitations. I would be happy with a few sentences indicating that the student could identify a seasonal pattern and make some suggestions as to why this might relate to the context, followed by a similar analysis of the shape of the trend. However there are some teachers who are requiring students to do independent literature exploration into the area, and requiring references, while forbidding the referencing of Wikipedia.

This concerns me, and I call for robust discussion.

Statistics is not research methods any more than statistics is mathematics. Research methods and standards of evidence vary between disciplines. Clearly the evidence required in medical research will differ from that of marketing research. I do not think it is the place of the statistics teacher to be covering this. Mathematics teachers are already being stretched to teach the unfamiliar material of statistics, and I think asking them and the students to become expert in research methods is going too far.

It is also taking out all the fun.

Keep the fun

Statistics should be fun for the teacher and the students. The context needs to be accessible or you are just putting in another opportunity for antipathy and confusion. If you aren’t having fun, you aren’t doing it right. Or, more to the point, if your students aren’t having fun, you aren’t doing it right.

Some suggestions about the role of context in teaching statistics and operations research

  • Use real data.
  • If the context is difficult to understand, you are losing the point.
  • The results should not be obvious. It is not interesting that year 12 boys weigh more than year 9 boys.
  • Null results are still results. (We aren’t trying for academic publications!)
  • It is okay to clean up data so you don’t confuse students before they are ready for it.
  • Sometimes you should use dirty data – a bit of confusion is beneficial.
  • Various contexts are better than one long project.
  • Avoid the plodding parts of research methods.
  • Avoid boring data. Who gives a flying fish about the relative sizes of dolphin jaws?
  • Wikipedia is a great place to find out the context for most high school statistics analysis. That is where I look. It’s a great starting place for anyone.

Interpreting Scatterplots

Patterns, vocab and practice, practice, practice

An important part of statistical analysis is being able to look at graphical representation of data, extract  meaning and make comments about it, particularly related to the context. Graph interpretation is a difficult skill to teach as there is no clear algorithm, such as mathematics teachers are used to teaching, and the answers are far from clear-cut.

This post is about the challenges of teaching scatterplot interpretation, with some suggestions.

When undertaking an investigation of bivariate measurement data, a scatterplot is the graph to use. On a scatterplot we can see what shape the data seems to have, what direction the relationship goes in, how close the points are to the line, if there are clear groups and if there are unusual observations.

The problem is that when you know what to look for, spurious effects don’t get in the way, but when you don’t know what to look for, you don’t know what is spurious. This can be likened to a master chess player who can look at a game in play and see at a glance what is happening, whereas the novice sees only the individual pieces, and cannot easily tell where the action is taking place. What is needed is pattern recognition.

In addition, there is considerable room  for argument in interpreting scatterplots. What one person sees as a non-linear relationship, another person might see as a line with some unusual observations. My experience is that people tend to try for more complicated models than is sensible. A few unusual observations can affect how we see the graph. There is also a contextual content to the discussion. The nature of the individual observations, and the sample can make a big difference to the meaning drawn from the graph. For example, a scatterplot of the sodium content vs the energy content in food should not really have a strong relationship. However, if the sample of food taken is predominantly fast food, high sodium content is related to high fat content (salt on fries!) and this can appear to be a relationship. In the graph below, is there really a linear relationship, or is it just because of the choice of sample?

In a set of data about fast food, there appears to be a relationship between sodium content and energy.

Students need to be exposed to a large number of different scatterplots. Fortunately this is now possible, thanks to computers. Students should not be drawing graphs by hand.

So how do we teach this? I think about how I learned to interpret graphs, and the answer is practice, practice, practice. This is actually quite tricky for teachers to arrange, as you need to have lots of sets of data for students to look at, and you need to make sure they are giving correct answers. Practice without feedback and correction can lead to entrenched mistakes.

Because graph interpretation is about pattern recognition, we need to have patterns that students can try to match the new graphs to. It helps to have some examples that aren’t beautifully behaved. The reality of data is that quite often the nature of measurement and rounding means that the graph appears quite different from the classic scatter-plot. The following graph has a strangely ordered look to it because the x-axis variable takes only whole numbers, and the prices are nearly always close to the nearest thousand.

The asking price of used Toyota sedans against the year of manufacture.

Students also need examples of the different aspects that you would comment on in a graph, using appropriate vocabulary. Just as musicians need to label different types of scales in order to communicate with each other their musical ideas, there is a specific vocabulary for describing graphs. Unfortunately the art of describing scatterplots is not as developed as music, and at times the terms are unclear and even used in different ways by different people.

Materials produced for teacher development, available on Census @ School, suggest the following things to comment on: Trend, Association, Strength, Groups and Unusual observations.

The following uses the framework provided by R. Kaniuk and R. Parsonage.

Trend covers the idea of whether the graph is linear or non-linear. I don’t really like the use of the word “trend” here, as to me it should be used for time-series data only. I would use the word “shape” in preference. It means a general tendency.

Association is about the direction. Is the relationship positive or negative? For example, “as the distance a car has travelled increases, the asking price tends to decrease.” The term “tends to” is very useful here.

Strength is about how close the dots are to the fitted line. In a linear model we can use correlation to quantify the strength. My experience is that students often confuse strength with slope.

Groups can appear in the data, and it is much more relevant if the appearance of groups is related to an attribute of the observations. For example in some data about food values in fast food, the dessert and salad items were quite separate from the other menu items. You can see that in the graph above of food items.

Unusual observations are a challenging feature of real-world data. Is it a mistake? Is it someone being silly, or misinterpreting a question? Is it not really from this population? Is it the result of a one-off rare occurrence (such as my redundancy payment earlier this year)? And what should you do with unusual observations? I’ve written a bit more about this in my post on dirty data. And there is uneven scatter, or heteroscedasticity, which does not affect model definition so much as prediction intervals.
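To illustrate the point about students confusing strength with slope, here is a minimal Python sketch. The two data sets are made up for the purpose: both are built around the same line with a slope of 2, and the "noise" values are deliberately balanced so that the fitted slope is identical in each case. Only the amount of scatter differs. The function name and the numbers are assumptions for this example, not taken from any of the materials mentioned above.

```python
import math

def slope_and_correlation(xs, ys):
    """Return the least-squares slope and the correlation coefficient r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    r = sxy / math.sqrt(sxx * syy)
    return slope, r

xs = [1, 2, 3, 4, 5, 6, 7, 8]

# Made-up pattern of ups and downs, chosen so that it sums to zero and
# does not tilt the fitted line in either direction.
pattern = [1, -1, -1, 1, 1, -1, -1, 1]

# Same underlying line (y = 2x), different amounts of scatter.
tight_ys = [2 * x + 0.5 * p for x, p in zip(xs, pattern)]
loose_ys = [2 * x + 4.0 * p for x, p in zip(xs, pattern)]

slope_tight, r_tight = slope_and_correlation(xs, tight_ys)
slope_loose, r_loose = slope_and_correlation(xs, loose_ys)

print(f"tight scatter: slope = {slope_tight:.2f}, r = {r_tight:.2f}")
print(f"loose scatter: slope = {slope_loose:.2f}, r = {r_loose:.2f}")
```

Both data sets give a slope of 2, but the correlation is much closer to 1 for the tightly scattered set. Slope says how steep the line is; strength says how closely the points follow it.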

On line practice works

An effective way to give students practice,  with timely feedback, is through on-line materials. Graphs take up a lot of room on paper, so textbooks cannot easily provide the number of examples that are needed to develop fluency. With our on-line materials we provide many examples of graphs, both standard, and not so well-behaved. Students choose from statements about the graphs. Most of the questions provide two graphs, as pattern recognition is easier to develop when looking at comparisons. For example if you give one graph and say “How strong is this relationship?”, it can be difficult to quantify. This is made easier when you ask which of two graphs has a  stronger relationship.

Students get immediate feedback in a “low-jeopardy” situation. When a tutor is working one-on-one with a student, it can be worrying to the student if they get wrong answers. The computer is infinitely patient and the student can keep trying over and over until they get their answers correct, thus reinforcing correct understanding.

This system and set of questions is part of our on-line resources for teaching Bivariate investigations, which occurs within the NZ Stats 3 course. You can find out more about our resources at www.statslc.com, and any teachers who wish to explore the materials for free should email me at n.petty(at)statslc.com.

Teaching experimental design

Teaching Experimental Design – a cross-curricular opportunity

The elements that make up a statistics, operations research or quantitative methods course cover three different dimensions (and more). There are:

  • techniques we wish students to master,
  • concepts we wish students to internalise, and
  • attitudes and emotions we wish the students to adopt.

Techniques, concepts and attitudes interact in how a student learns and perceives the subject. Sadly it is possible (and not uncommon) for students to master techniques, while staying oblivious to many of the concepts, and with an attitude of resignation or even antipathy towards the discipline.

Techniques

Often, and less than ideally, course design begins with techniques. The backbone is a list of tests, graphs and procedures that students need to master in order to pass the course. The course outline includes statements like:

  • Students will be able to calculate a confidence interval for a mean.
  • Students will be able to formulate a linear programming model from data.
  • Students will use Excel to make correct histograms. (Good luck with this one!)

Textbooks are organised around techniques, which usually appear in a given sequence, relying on the authors’ perception of how difficult each technique is. Textbooks within a given field are remarkably similar in the techniques they cover in an introductory course.

Concepts

Concepts are more difficult to articulate. In a first course in statistics we wish students to gain an appreciation of the effects of variation. They need to understand how data from a sample differs from population data. In all of the mathematical decision sciences students struggle to understand the nature of a model. The concept of a mathematical model is far from intuitive, but essential.

Attitudes

You can’t explicitly teach attitudes. “Today class, you are going to learn to love statistics!” These are absorbed and formed and reformed as part of the learning process, as a result of prior experiences and attitudes. I have written a post on Anxiety, fear and antipathy for maths, stats and OR, which describes the importance of perseverance, relevance, borrowed self-efficacy and love in the teaching of these subjects. Content and problem context choices can go a long way towards improving attitudes. The instructor should know whether his or her class is more interested in the trajectories of gummy bears, or the more serious topics of cancer screening and crime prevention. Classes in business schools will use different examples than classes in psychology or forestry. Whatever the context, the data should be real, so that students can really engage with it.

I was both amused and a little saddened at this quote from a very good book, “Succeed – how we can reach our goals”. The author (Heidi Grant Halvorson) has described the outcomes of some interesting experiments regarding motivation. She then says, “At this point, you may be wondering if social psychologists get a particular pleasure out of asking people to do really odd things, like eating Cheerios with chopsticks, or eating raw radishes, or not laughing at Robin Williams. The short answer is yes, we do. It makes up for all those hours spent learning statistics.” Hmmm

Experimental Design

So what does this have to do with experimental design?

I have a little confession. I’ve never taught experimental design. I wish I had. I didn’t know as much then as I do now about teaching statistics, and I also taught business students. That’s my excuse, but I regret it. My reasoning was that businesses usually use observational data, not experimental data. And it’s true, except perhaps in marketing research, and process control and possibly several other areas. Oh.

George Cobb, whom I have quoted in several previous posts, proposed that experimental design is a mechanism by which students may learn important concepts. The technique is experimental design, but taught well, it is a way to convey important concepts in statistics and decision science. The pivotal concept is that of variation. If there were no variation, there would be no need for statistics or experimentation. It would be a sad, boring deterministic world. But variation exists, some of which is explainable, and some of which is natural, some of which is due to sampling and some of which is due to bad sampling or experimental practices. I have a YouTube video that explains these four sources of variation. Because variation exists, experiments need to be designed in such a way that we can uncover as best we can the explainable variation, without confounding it with the other types of variation.

The new New Zealand curriculum for Mathematics and Statistics includes experimental design at levels 2 and 3 of the National Certificate of Educational Achievement. (The last two years of Secondary School). The assessments are internal, and teachers help students set up, execute and analyse small experiments. At level two (implemented this year) the experiments generally involve two groups which are given two treatments, or a treatment and a control. The analysis involves boxplots and informal inference. Some schools used paired samples, but found the type of analysis to be limited as a result.  At level three (to be implemented in 2013) this is taken a step further, but I haven’t been able to work out what this step is from the curriculum documents. I was hoping it might be things like randomised block design, or even Taguchi methods, but I don’t think so.

Subjects for Experimentation

Bearing in mind the number of students, many of whom wish to use other members of the class, there can be issues of time and fatigue. Here are some possibilities. It would be great if other suggestions could be added as comments to this post.

Behavioural

Some teachers are reluctant to use psychological experiments as it can be a bit worrying to use our students as guinea pigs. However, this is probably the easiest option, and provided informed and parental consent is received, it should be acceptable. All sorts have been suggested such as effects of various distractions (and legal stimulants) on task completion. There are possible experiments in Physical Education (Evaluate the effectiveness of a performance enhancing programme). Or in Music – how do people respond to different music?

I’d love to see some experiments done on the time taken to solve Rogo puzzles, and on what the effect of route length, number choice, size or age is.

Biology

Anything that involves growing things takes a while and can be fraught. (My own recollection of High School biology is that all my plants died.) But things like water uptake could be possible. Use sticks of celery of different lengths and see how much water they take up in a given time. Germination times or strike rates under different circumstances using cress or mustard?  Talk to the Biology teacher. There are assessment standards in NZ NCEA at levels 2 and 3 which mesh well with the statistics standards.

Technology

Baking. There are various ingredients that could have two or three levels of inclusion – making muffins with and without egg – does it affect the height? Pretty tricky to control, but fun – maybe use uniform amounts of mixture. Talk to the Food tech teacher.

Barbie bungee jumping. How does Barbie’s weight affect how far she falls? By having Barbie with and without a backpack, you get the two treatments. The bungee cords can be made out of rubber bands or elastic.

Things flying through the air from catapults. This has been shown to work as a teaching example. There are a number of variables to alter, such as the weight of the object, the slope of the launchpad, and the person firing.

Inject statistical ideas in application areas

John Maindonald from ANU made the following comment on a previous post: “I am increasingly attracted to the idea that the place to start injecting statistical ideas is in application areas of the curriculum.  This will however work only if the teaching and learning model changes, in ways that are arguably anyway necessary in order to make effective use of those teachers who have really good and effective mathematics and statistics and computing skills.”

How exciting is that? Teachers from different discipline areas work together! There may well be logistical issues and even problems of “turf”. But wouldn’t it be great for mathematics teachers to help students with experiments and analysis in other areas of the curriculum. The students will gain from the removal of “compartments” in their learning, which will help them to integrate their knowledge. The worth of what they are doing would be obvious.

(Note for teachers in NZ. A quick look through the “assessment matrices” for other subjects uncovered a multitude of possibilities for curricular integration if the logistics and NZQA allow. )

What Mathematics teachers need to know about statistics

My post suggesting that statistics is more vital for efficient citizens than algebra has led to some interesting discussions on Twitter and elsewhere. Currently I am beginning an exciting venture to provide support materials for teachers and students of statistics, starting with New Zealand. These two circumstances have led me to ponder about why maths teachers think that statistics is a subset of mathematics, and what knowledge and attitudes will help them make the transition to teaching statistics as a subject.

An earlier post called for mathematics to leave statistics alone. This post builds on that by providing some ways of thinking that might be helpful to mathematics teachers who have no choice but to teach statistics.

Statistics is not a subset of mathematics

Let me quote a forum post from a teacher of mathematics in New Zealand:

  • “It seems strange to me that Statistics is a small part of Mathematics (which also includes Trigonometry, Algebra, Geometry, Calculus … but for our year 13s and now our year 12s, it’s attained equal parity as one area with all the other branches of maths put together as another area.”

This is very helpful as it lets us see where the writer is coming from. To him, statistics is a subset of mathematics – a small part, and somehow it has managed to push its way to become on equal footing with “mathematics.”

I disagree.

I think we need to take a look at the role of compulsory schooling. It is popular among people who go to university, and even more so among those who never leave (having become academics themselves) to think that the main, if not only role of school is to prepare students for university. If the students somehow have not gained the skills and knowledge that the university lecturers believe are necessary, then the schools have failed – or worse still, the system has failed. Again I disagree.

The vision for the young people of New Zealand is stated in the official curriculum.

“Our vision is for young people:

  • who will be creative, energetic, and enterprising
  • who will seize the opportunities offered by new knowledge and technologies to secure a sustainable social, cultural, economic, and environmental future for our country
  • who will work to create an Aotearoa New Zealand in which Māori and Pākehā recognise each other as full Treaty partners, and in which all cultures are valued for the contributions they bring
  • who, in their school years, will continue to develop the values, knowledge, and competencies that will enable them to live full and satisfying lives
  • who will be confident, connected, actively involved, and lifelong learners.”

It doesn’t actually mention preparing people for university.

My view is that school is about preparing young people for life, while helping them to enjoy the journey.

What teachers need to know about statistics

Statistics for life can be summed up as C, D, E, standing for Chance, Data and Evidence.

Chance

Students need to understand about the variability in their world. Probability is a mathematical way of modelling the inherent uncertainty around us. The mathematical part of probability includes combinatorics and the ability to manipulate tables. You can use Venn diagrams and trees if you like, and tables can be really useful too. The most difficult part for many students is converting the ideas into mathematical terminology and making sense of that.

Bear in mind that perceptions of chance have cultural implications. Some cultures play board games and other games of chance from a young age, and gain an inherent understanding of the nature of uncertainty as provided by dice. However there are other cultures for whom all things are decided by God, and nothing is by chance. There are many philosophical discussions which can be had regarding the nature of uncertainty and variability. The work of Tversky and Kahneman and others has alerted us to the misconceptions we all have about chance.

An area where the understanding of probabilities and relative risk is vital is that of medical screening. Studies among medical practitioners have shown that many of them cannot correctly estimate the probability of a false positive, or the probability of a true positive, given that the result of a test is positive. This is easily conveyed through contingency tables, which are now part of the NZ curriculum.
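As a rough illustration of the contingency-table approach, the Python sketch below fills in the four cells of a table for a hypothetical screening test and then reads off the probability of having the condition given a positive result. The prevalence, sensitivity and specificity are invented numbers chosen for the example, not figures from any study, and the function names are my own.

```python
def screening_table(population, prevalence, sensitivity, specificity):
    """Expected counts in each cell of a screening contingency table."""
    with_condition = population * prevalence
    without_condition = population - with_condition
    true_pos = with_condition * sensitivity      # have it, test positive
    false_neg = with_condition - true_pos        # have it, test negative
    true_neg = without_condition * specificity   # don't have it, test negative
    false_pos = without_condition - true_neg     # don't have it, test positive
    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "false_pos": false_pos,
        "true_neg": true_neg,
    }

def prob_condition_given_positive(table):
    """Of everyone who tests positive, what proportion has the condition?"""
    all_positives = table["true_pos"] + table["false_pos"]
    return table["true_pos"] / all_positives

# Hypothetical test: 1% of people have the condition, the test picks up
# 90% of those who have it, and correctly clears 95% of those who don't.
table = screening_table(
    population=10_000, prevalence=0.01, sensitivity=0.90, specificity=0.95
)
ppv = prob_condition_given_positive(table)

for cell, count in table.items():
    print(f"{cell:>10}: {count:8.0f}")
print(f"P(condition | positive test) = {ppv:.3f}")
```

With these made-up numbers, only about 15% of the people who test positive actually have the condition, even though the test sounds accurate. Laying the counts out in a table makes it easy to see why: the 495 false positives from the large healthy group swamp the 90 true positives.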

Data

When people talk about “statistics”, more often they are talking about data and information than the discipline of statistical analysis. Just about everyone is interested in some area of statistics. Note the obsession of the media for reporting the road toll and comparing with previous years or holiday periods. Sports statistics occupy many people’s thoughts, and can fill in the (long) gaps between the action in a cricket commentary. Weather statistics are vital to farmers, planners, environmentalists. Hospitals are now required to report various statistics. The web is full of statistics. It is difficult to think of an area of life which does not use statistics. The second thing we want to know about a new-born baby is its weight.

Just because data contains numbers does not make it mathematics. There are arithmetic skills, such as adding and dividing, which can be practised using data. But that’s about it when it comes to mathematics and data. These days we have computer packages which can calculate all sorts of summary values, and create graphs for better or worse, so the need for mathematical or numeracy skills is much diminished. What is needed is the ability to communicate ideas using numbers and diagrams; by communication I mean production and interpretation of reports and diagrams.

The area of data also includes the collection of data. This is taught at all levels of the NZ curriculum. Students are taught to think about measurement, both physical and through questionnaires. Eventually students learn to design experiments to explore new ideas. Some might see this as science or biology, social studies or psychology, technology and business. There are even applications in music where students explore people’s music preferences. Data occurs in all subjects, and really the skills of data analysis should be taught in context. But until the current generation of students become the teachers, we may need to rely on the teachers of statistics to provide support. There are wonderful opportunities for collaboration between disciplines, if our compartmentalised school system would allow them.

Evidence

Much data is population data and conclusions can easily be drawn from it. However we also use samples to draw conclusions about populations. Inferential statistics has been developed using theoretical probability distributions to help us use samples to draw conclusions about populations. Unfortunately the most popular form of inference, hypothesis testing, is counter-intuitive at best. Many teachers do not truly understand the application of inferential statistics – and why should they – they may never have performed a real statistical analysis. It is only through repeated application of techniques to multiple contexts that most people can start to feel comfortable and get some understanding of what is happening. The beauty is that today the technology makes it possible for students to perform multiple analyses so that they can learn to tell the specific from the general.

The New Zealand school system has taken the courageous* step to introduce the use of resampling, also known as bootstrapping or randomisation, for the generation of confidence intervals. This is contentious and is causing teachers concern. I will dedicate a whole post to the ideas of resampling and why they may be preferable to more traditional approaches. I empathise with the teachers who are feeling out of their depth, and hope that our materials, along with the excellent ones provided by “Census at School” can be of help.
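For readers who have not met resampling before, here is a minimal sketch of a percentile bootstrap confidence interval for a median. It is only an illustration of the general idea and not the exact procedure specified in the NZ curriculum or used by any particular package. The function name bootstrap_ci, the number of resamples and the heights are all invented for the example.

```python
import random
import statistics


def bootstrap_ci(sample, stat=statistics.median, n_resamples=2000, level=0.95, seed=1):
    """Percentile bootstrap interval for `stat` (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    n = len(sample)
    # Resample WITH replacement, same size as the original sample,
    # and record the statistic for each resample.
    estimates = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_resamples))
    tail = (1 - level) / 2
    lower = estimates[round(tail * n_resamples)]
    upper = estimates[round((1 - tail) * n_resamples) - 1]
    return lower, upper


# Invented data: heights (cm) of twelve students.
heights_cm = [152, 155, 158, 160, 161, 163, 165, 166, 168, 170, 174, 181]
low, high = bootstrap_ci(heights_cm)
print(f"Sample median: {statistics.median(heights_cm)}")
print(f"Approximate 95% interval for the median: {low} to {high}")
```

The appeal for teaching is that every step is something a student can picture doing by hand with slips of paper, with no theoretical distribution required.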

I have no doubt that educators all over the world are watching to see how this goes before attempting similar moves in their own countries. Yet again New Zealand gets to lead the world. Watch this space!

*In the popular British television show, “Yes Minister”, the public servant, Sir Humphrey, would use the term “courageous” to describe a proposal which was probably right, but also likely to lose votes.

Lies and statistics

One of the most famous sayings about statistics is the line: “There are three kinds of lies: lies, damned lies, and statistics.” This was popularised by author Mark Twain (Samuel Clemens), who attributed it to British statesman Benjamin Disraeli. There is a book entitled “How to Lie with Statistics”. Within high school education students are taught about misleading graphs. It seems clear that statistics and facts are not the same thing. Yet one True/False question many of my students continue to get wrong says “Statistical analysis is an objective science, unaffected by the researcher’s opinions.” The correct response is False, yet 44% of students put True. Referring back to my earlier post “You’re teaching it wrong”, I realise that I have work to do in helping students to recognise the subjective aspects of statistics.

Two scientists discussing

Statistics is not an objective science

It may be that the students are not sure about what is meant by subjective. Any post-modern researcher realises that very little is objective. We strive in science and analysis for “the facts” unsullied by human interpretation, but objectivity remains elusive in most endeavours. Like it or not, our own world-views affect the decisions we are required to make. We do not see the world as IT is, but rather as we are. Two people seeing the same scene can describe it totally differently, each convinced that he or she is correct and the other in error.

Subjectivity is generally unintentional. As part of the qualitative part of my mixed methods PhD research I was required to include a “statement of bias”, wherein I described my own views and circumstances which may have influenced my understanding of the data. As my research related to the education of children with vision impairment, it was clear that having a son who is totally blind would affect my interpretation. However it is also important to bear in mind that a person who did NOT have a child with vision impairment would also be influenced by their own circumstances. It was also instructive to see how my views were affected by the research. My opinions of groups of people and circumstances and rights all changed over the eight years of the study.

Subjective bias can creep into statistical analysis at all stages. I tell the students that when they read a statistical report it is important to think of the possible biases the person publishing it might have. The choice of significance level at which to reject the null hypothesis is a value judgement. The sample size, questions asked, order of the questions, manner of sampling, data cleaning methods and choice of which aspects to report or ignore are all judgements made by the person performing the analysis. The way data is represented in graphs and even the choice of vocabulary affect the interpretation of the “facts”. Sometimes the bias may seem clear, such as when the research is funded by a company with vested interests. It is less clear when the researcher’s biases are similar to our own. It can be difficult to find flaws in research which supports our own opinion.
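To make the point about significance levels concrete, here is a small sketch using invented data: suppose 15 out of 20 students prefer one of two options, and we test the null hypothesis that preferences are split 50:50. The helper name two_sided_binomial_p is hypothetical, and the exact binomial test is just one reasonable choice of test. The data and the p-value stay the same; only the analyst's chosen cut-off changes the "conclusion".

```python
from math import comb


def two_sided_binomial_p(successes, trials):
    """Exact two-sided p-value for H0: probability of success = 0.5."""
    # Under H0 the distribution is symmetric, so double the upper tail.
    extreme = max(successes, trials - successes)
    upper_tail = sum(comb(trials, k) for k in range(extreme, trials + 1)) / 2 ** trials
    return min(1.0, 2 * upper_tail)


# Invented data: 15 of 20 students prefer option A.
p_value = two_sided_binomial_p(15, 20)

# The same evidence judged against three different significance levels.
decisions = {alpha: p_value < alpha for alpha in (0.10, 0.05, 0.01)}

print(f"p-value = {p_value:.4f}")
for alpha, reject in decisions.items():
    print(f"alpha = {alpha:.2f}: {'reject' if reject else 'do not reject'} the null hypothesis")
```

An analyst who chose 0.05 would report a significant preference, while one who chose 0.01 would not, and neither has made an arithmetic error.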

The presence of subjectivity is important to teach at all levels of statistics, and is one of the places where mathematics and the decision sciences of Statistics and Operations Research part company. Not being a pure mathematician, I can only postulate that pure mathematicians believe that mathematics is objective and free of the taint of human bias.  But with statistics it is possible right from the early stages to point out how different students in a class have shown different things in graphs using the same data. This is exactly when statistics can become really exciting and thought-provoking rather than mechanistic number crunching. This is why statistics may be better taught in a social studies or science class, or at least in a cross-disciplinary setting.

It is not difficult to teach the subjective nature of statistics. It can be brought in as class discussions. Data should ALWAYS be within a context, which then means any discussion or evaluation of outcomes is rooted in the students’ experience and can be further analysed for validity and applicability to real life. It may require an attitude shift, away from the unique and satisfying correctness of mathematics, and it also may need care not to undermine confidence in all statistical analysis. It is important that this is seen as pivotal to statistical analysis, and not the messy stuff that happens around the edges. Case studies are useful for this. As usual in writing my blog I have come up with ideas that would improve my own course! I would love to hear if any of my readers implement any of these ideas.

As part of our first year Management Science paper we include a section on ethics. Students are required to identify possible conflicts of interest in scenarios, and the concept of worldviews is touched on. This is quite difficult for students in their late teens as they tend to be rather naive and “black and white” in their thinking. But to me this is the role of the university – to challenge their ways of thinking so that steam comes out their ears. It may be that the business students we get are less used to playing with ideas in the way that history or arts students may be. Whatever the reason, it is fun to challenge them.

In closing I’d like to say thanks for the support expressed in response to my previous post about the demise of Operations Research at UC. It is a loss to the country, as Mike Trick and others pointed out. And it is a tough time for my colleagues who are now looking for other work. The insights from the discipline of OR are valuable and I hope that the thousands of students we have taught over the years remember the subjective aspects of OR and statistics.