All models are wrong

In my title I quote George Box, who wrote, “Essentially, all models are wrong, but some are useful”. I wish economists would remember this more often.

Statistics and Operations Research (and many other sciences) are based on the concept of a mathematical model. Aspects of a “real world” problem are quantified, analysed, explored, experimented with, sometimes even optimised, and the results are linked back to the original problem. This idea of a model is one we have tried to teach our students. It is a surprisingly difficult idea, and one that needs frequent revisiting. The following diagram is more complex than we would start with, but the idea of moving from the concrete to the abstract and then back to the concrete is a helpful one. The diagram illustrates the iterative nature of problem-solving in Operations Research/Management Science, and has been used as a framework for both an entry-level course and an MBA course.

This diagram of the OR/MS process illustrates the iterative nature of modelling, and the relationship between the model and the real world situation it represents.

It is interesting to see that “Build a mathematical model” sits right in the middle between qualitative and quantitative, between the concrete world and the abstract world. This is where the two meet, and where the challenge lies. This diagram was devised for Operations Research. I should try to make one for statistics.

I have recently been reading textbook sections about probability which include statements like, “The volumes of milk follow a normal distribution.” I have a problem with this. It makes it sound as if there were some rule in the universe which is making the volumes behave to fit a given distribution. I would prefer to say, “The volumes of milk can be modelled using a normal distribution.” This is closer to the truth. It is useful to have a model of the behaviour of the milk quantities, and we find that the normal distribution does a very good job. But the god of milk quantities is not controlling the output of a machine in order to fit the normal distribution.
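To make the distinction concrete, here is a minimal Python sketch of what "modelled using a normal distribution" looks like in practice. The mean of 1000 ml and standard deviation of 5 ml are made-up values for illustration only, not measurements from any real filling machine.

```python
from statistics import NormalDist

# Hypothetical parameters, chosen only for illustration.
MEAN_ML = 1000.0
SD_ML = 5.0

# The model: a normal distribution standing in for the milk volumes.
milk_model = NormalDist(mu=MEAN_ML, sigma=SD_ML)

# What the model says about how often a carton holds less than 990 ml.
p_underfilled = milk_model.cdf(990.0)

# The volume below which, according to the model, 1% of cartons fall.
first_percentile_ml = milk_model.inv_cdf(0.01)

print(f"P(volume < 990 ml) under the model: {p_underfilled:.4f}")
print(f"1st percentile under the model: {first_percentile_ml:.1f} ml")
```

The model answers a practical question about underfilled cartons. Strictly speaking it also allows volumes below zero, which no real carton can have, so it is wrong in Box's sense while still being useful.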

I worry when people get confused about the role and nature of a model. Economists have models of how the economy should behave under given circumstances. They use their models to predict, and at times attempt to prescribe, the future. However, when people don’t behave in the desired fashion, it seems to them that it is the people who are wrong and ought to change their behaviour, not the model. A new sub-area of Economics, Experimental Economics, has emerged to try to test how people really do behave when they do not behave “rationally”, i.e. as economists think they ought to behave. I struggle to see how this is not a poor man’s version of psychology, especially as it appears many of the experiments are conducted on college students, who are paid to make gaming decisions in an environment of no jeopardy and no accountability. Their decisions are somehow meant to be representative of regular people making important decisions about their finances and future.

I wonder if the perception of a model as a rule controlling the universe is where some misconceptions about probability and chance originate. There is the gambler’s fallacy, that a sequence is somehow self-correcting. At a roulette wheel if we have observed a long string of black, then the series must soon correct itself and have some reds so that it can approach the 50:50 probability that it should display. Even though the probability of a red next turn is totally independent of what has happened before, we can’t help but think that the rules of the universe must be obeyed.
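The independence claim can be checked with a small simulation. The sketch below is a simplified model of my own: it assumes an idealised wheel with only red and black, each with probability 0.5 (a real wheel also has a green zero), and the function name and default values are purely illustrative. It looks at every spin that follows a run of at least five blacks and counts how often that spin is red.

```python
import random

def red_rate_after_black_streak(n_spins=200_000, streak=5, seed=1):
    """Share of reds among spins that follow at least `streak` blacks in a row.

    Assumes an idealised 50:50 red/black wheel with independent spins.
    """
    rng = random.Random(seed)
    blacks_in_a_row = 0  # length of the current run of blacks
    followers = 0        # spins that came straight after a long black run
    reds = 0             # how many of those spins were red
    for _ in range(n_spins):
        is_red = rng.random() < 0.5
        if blacks_in_a_row >= streak:
            followers += 1
            reds += is_red
        blacks_in_a_row = 0 if is_red else blacks_in_a_row + 1
    return reds / followers, followers

rate, count = red_rate_after_black_streak()
print(f"{count} spins followed a run of 5+ blacks; share of reds: {rate:.3f}")
```

If the sequence were self-correcting, the share of reds after a long run of blacks would sit noticeably above one half. In this model it stays close to one half, because nothing in the wheel remembers the earlier spins.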

Perhaps it is this same misreading of a model that leads to the undue attribution of causation. A student sees that in a regression model, as variable x increases, so does variable y. This ALWAYS happens in the model, and somehow this translates to causation in the real world context.
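A simulated example may help students see the gap between the fitted line and the real world. In the sketch below, which is an invented scenario and not taken from any real data, a hidden variable z drives both x and y, and x has no effect on y at all. The function names and noise levels are assumptions made for illustration.

```python
import random

def slope(xs, ys):
    """Least-squares slope of y on x (simple linear regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def simulate(n=10_000, seed=1):
    rng = random.Random(seed)
    # Hidden common cause of both x and y.
    z = [rng.gauss(0, 1) for _ in range(n)]
    # Observed x depends on z; y depends on z but NOT on x.
    x_observed = [zi + rng.gauss(0, 1) for zi in z]
    y = [zi + rng.gauss(0, 1) for zi in z]
    # An "experiment": x is set at random, so it no longer tracks z.
    x_assigned = [rng.gauss(0, 1) for _ in range(n)]
    return slope(x_observed, y), slope(x_assigned, y)

observed_slope, experimental_slope = simulate()
print(f"Slope when x is merely observed: {observed_slope:.2f}")
print(f"Slope when x is set at random:   {experimental_slope:.2f}")
```

The regression on observed data gives a clearly positive slope, yet when x is set independently the slope is close to zero. The model describes how x and y move together, not what would happen to y if we changed x.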

When we are teaching about models we need to be clear that they are models. “Word problems” in mathematics are often the things that students find difficult. I remember as a child, thinking, “just give me the numbers and I can work out the answer, but stop trying to confuse me with all these words.” And often the word problems are just a framework set up around the mathematical issue the teacher wishes to address. How much better it would be if the problems were authentic, and related to the pupils’ lives. Early exposure to the idea that the maths is not the reality, but only an abstraction, might help improve critical thinking in the future economists and decision-makers we teach.

And in the meantime I will write questions which say that the quantities of milk are appropriately modelled using the normal distribution.


