Why I am going to ICOTS9 in Flagstaff, Arizona

I was a university academic for twenty years. One of the great perks of academia is the international conference. Thanks to the tax-payers of New Zealand I have visited Vancouver, Edinburgh, Melbourne (twice), San Diego, Fort Lauderdale, Salt Lake City and my favourite, Ljubljana. This is a very modest list compared with many of my colleagues, as I didn’t get full funding until the later years of my employ.

Academic conferences enable university researchers and teachers from all over the world to gather together and exchange ideas and contacts. They range from fun and interesting to mind-bogglingly boring. My first conference was IFORS in Vancouver in 1996, and I had a blast. It helped that my mentor, Hans Daellenbach, was also there, and I got to meet some of the big names in operations research. I have since attended two other IFORS conferences, and it is amazing how connected you can feel to people whom you meet only every few years. I always try to go to most sessions of the conference as I feel an obligation to the people who have paid to have me there. It is unethical to be paid to go to a conference, and then turn up only for a couple of sessions and the banquet. Sometimes sessions that I have only limited connection with can turn out to be interesting. I found I could always listen for the real world application that I could then include in my teaching. That would usually take up the first few minutes of the talk. Once the formulas appeared I would glaze over and go to my happy place. Having said that, I also think mental health breaks are important, and would take time out to reflect. I get more out of conferences if I leave my husband at home. The quiet time in my hotel room was also important for invigorating my teaching and research.

Most academic conferences focus on research, though they often have a teaching stream, which I frequent. ICOTS is different though as it is mostly about teaching, with a research stream! ICOTS stands for International Conference on Teaching Statistics, and runs every four years. I attended my first ICOTS in Slovenia in 2010. What surprised me was how many people there were from New Zealand! At the welcome reception I wandered around introducing myself to people and more often than not found they were also from New Zealand. How ironic to spend 40 hours getting to this amazing place and meet large numbers of fellow Kiwis (named for the bird, not the fruit)! Ljubljana is a wonderful city, with fantastic architecture and lots of bike routes and geocaches. I made good use of my spare time. The conference itself was inspiring too. I attended just about every session, and gave a paper about making videos to teach statistics. I saw the dance of the p-value, and learned about statistics teaching in some African countries. I was impressed by the keynote by Gerd Gigerenzer, and went home and cancelled my mammogram. I put faces to some of the names in statistics education, though I was sad not to see George Cobb there, or Joan Garfield. What struck me was how nice everyone was. I loved my trip to some caves on the half-day excursion.

The point of this post is to encourage readers to go to ICOTS 9 in July this year. I admit I was a little disappointed when they announced the venue. I was hoping for somewhere a little more exotic. However the great benefit is that it is going to cost considerably less to get there than to many countries, and take less time. (For people from New Zealand and Australia, a trip of less than 24 hours is a bonus.) Now that I am no longer paid by a university to go to conferences, the cost is a big consideration. If necessary I will sell our caravan. Another benefit of the venue is it is very convenient for teachers from the US to attend. I am hoping to find out more about AP statistics, and other US statistics teaching.

I am currently reviewing an edited book published by Springer, Probabilistic Thinking. As I read each chapter I am increasingly excited that most of the authors will be attending ICOTS9. This is a great opportunity to discuss with them their ideas, and how to apply them in the classroom and in our resources. I am particularly interested in the latest research on how children and adults learn statistics and probability. This ICOTS I am doing a presentation about setting up a blog, Twitter and YouTube. In four years’ time I hope to be able to add to the research using what we have learned from students’ responses on our on-line resources.

I am a little apprehensive about the altitude and temperature, but have planned to arrive a few days early in Phoenix to acclimatise myself. In the interests of economy I will be staying at the university dorms, and just found out there is no air-conditioning in the bedrooms. My daughter-in-law from Utah tells me to buy a fan. I’m pretty happy about a trip to the Grand Canyon on the afternoon off. The names of presenters and their abstracts are now available on the ICOTS9 website, so you can see what interesting times await.

I really hope I see a lot of you there – and not just New Zealanders.


The Myth of Random Sampling

I feel a slight quiver of trepidation as I begin this post – a little like the boy who pointed out that the emperor has no clothes.

Random sampling is a myth. Practical researchers know this and deal with it. Theoretical statisticians live in a theoretical world where random sampling is possible and ubiquitous – which is just as well really. But teachers of statistics live in a strange half-real-half-theoretical world, where no one likes to point out that real-life samples are seldom random.

The problem in general

In order for most inferential statistical conclusions to be valid, the sample we are using must obey certain rules. In particular, each member of the population must have an equal probability of being chosen. In this way we reduce the opportunity for systematic error, or bias. When a truly random sample is taken, it is almost miraculous how well we can make conclusions about the source population, with even a modest sample of a thousand. On a side note, if the general population understood this, and the opportunity for bias and corruption were eliminated, general elections and referenda could be done at much less cost, through taking a good random sample.
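To get a feel for how well a sample of a thousand can do, here is a minimal simulation sketch. The 52% "yes" share, the seed, and the function names are all made up for illustration, and each respondent is drawn independently, which assumes the population is much larger than the sample.

```python
import math
import random

def simulate_poll(true_share, sample_size, rng):
    """Return the sample proportion from one simple random sample."""
    hits = sum(1 for _ in range(sample_size) if rng.random() < true_share)
    return hits / sample_size

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion."""
    return z * math.sqrt(p * (1 - p) / n)

rng = random.Random(2014)   # fixed seed so the run is repeatable
true_share = 0.52           # hypothetical share of the population voting "yes"
n = 1000                    # the "modest sample of a thousand"

# Repeat the poll many times and see how often it lands near the truth.
estimates = [simulate_poll(true_share, n, rng) for _ in range(500)]
moe = margin_of_error(true_share, n)
within = sum(abs(e - true_share) <= moe for e in estimates) / len(estimates)

print(f"margin of error: {moe:.3f}")
print(f"share of polls within the margin: {within:.2f}")
```

The margin of error comes out at roughly three percentage points, and it depends on the sample size rather than on the size of the population, which is the part many people find surprising.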

However! It is actually quite difficult to take a random sample of people. Random sampling is doable in biology, I suspect, where seeds or plots of land can be chosen at random. It is also fairly possible in manufacturing processes. Medical research relies on the use of a random sample, though it is seldom of the total population. Really it is more about randomisation, which can be used to support causal claims.

But the area of most interest to most people is people. We actually want to know about how people function, what they think, their economic activity, sport and many other areas. People find people interesting. To get a really good sample of people takes a lot of time and money, and is outside the reach of many researchers. In my own PhD research I approximated a random sample by taking a stratified, cluster semi-random almost convenience sample. I chose representative schools of different types throughout three diverse regions in New Zealand. At each school I asked all the students in a class at each of three year levels. The classes were meant to be randomly selected, but in fact were sometimes just the class that happened to have a teacher away, as my questionnaire was seen as a good way to keep them quiet. Was my data of any worth? I believe so, of course. Was it random? Nope.

Problems people have in getting a good sample include cost, time and also response rate. Much of the data that is cited in papers is far from random.

The problem in teaching

The wonderful thing about teaching statistics is that we can actually collect real data and do analysis on it, and get a feel for the detective nature of the discipline. The problem with sampling is that we seldom have access to truly random data. By random I am not meaning just simple random sampling, the least simple method! Even cluster, systematic and stratified sampling can be a challenge in a classroom setting. And sometimes if we think too hard we realise that what we have is actually a population, and not a sample at all.

It is a great experience for students to collect their own data. They can write a questionnaire and find out all sorts of interesting things, through their own trial and error. But mostly students do not have access to enough subjects to take a random sample. Even if we go to secondary sources, the data is seldom random, and the students do not get the opportunity to take the sample. It would be a pity not to use some interesting data, just because the collection method was dubious (or even realistic). At the same time we do not want students to think that seriously dodgy data has the same value as a carefully collected random sample.

Possible solutions

These are more suggestions than solutions, but the essence is to do the best you can and make sure the students learn to be critical of their own methods.

Teach the best way, pretend and look for potential problems.

Teach the ideal and also teach the reality. Teach about the different ways of taking random samples. Use my video if you like!

Get students to think about the pros and cons of each method, and where problems could arise. Also get them to think about the kinds of data they are using in their exercises, and what biases they may have.
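For a classroom comparison of the methods, the following is a rough sketch of simple random, systematic, stratified and cluster sampling on an invented school roll. The roll, the field names and the function names are hypothetical, and each function is only one of several reasonable ways to carry out its method.

```python
import random

rng = random.Random(1)  # fixed seed so the class can reproduce the samples

# Hypothetical roll: 300 students in 12 classes, each class within one year level.
population = [
    {"id": i, "year": 9 + i % 3, "class": i % 12}
    for i in range(300)
]

def simple_random(pop, n):
    """Every student has the same chance of selection."""
    return rng.sample(pop, n)

def systematic(pop, n):
    """Random starting point, then every k-th student on the roll."""
    step = len(pop) // n
    start = rng.randrange(step)
    return pop[start::step][:n]

def stratified(pop, n, key):
    """Equal-sized simple random samples from each stratum (e.g. year level)."""
    strata = {}
    for person in pop:
        strata.setdefault(person[key], []).append(person)
    per_stratum = n // len(strata)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, per_stratum))
    return sample

def cluster(pop, n_clusters, key):
    """Choose whole clusters (e.g. classes) at random and take everyone in them."""
    clusters = sorted({person[key] for person in pop})
    chosen = set(rng.sample(clusters, n_clusters))
    return [person for person in pop if person[key] in chosen]

print(len(simple_random(population, 30)))
print(len(systematic(population, 30)))
print(len(stratified(population, 30, "year")))
print(len(cluster(population, 2, "class")))
```

Students can then discuss which of these they could realistically carry out at their own school, and what would go wrong with each.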

We also need to teach that, used judiciously, a convenience sample can still be of value. For example I have collected data from students in my class about how far they live from university, and whether or not they have a car. This data is not a random sample of any population. However, it is still reasonable to suggest that it may represent all the students at the university – or maybe just the first year students. It possibly represents students in the years preceding and following my sample, unless something has happened to change the landscape. It has worth in terms of inference. Realistically, I am never going to take a truly random sample of all university students, so this may be the most suitable data I ever get. I have no doubt that it is better than no information.

Not all questions are of equal worth. Knowing whether students who own cars live further from university, in general, is interesting but not of great importance. Were I to be researching topics of great importance, such as safety features in roads or medicine, I would have a greater need for rigorous sampling.

So generally, I see no harm in pretending. I use the data collected from my class, and I say that we will pretend that it comes from a representative random sample. We talk about why it isn’t, but then we move on. It is still interesting data, it is real and it is there. When we write up analysis we include critical comments with provisos on how the sample may have possible bias.

What is important is for students to experience the excitement of discovering real effects (or lack thereof) in real data. What is important is for students to be critical of these discoveries, through understanding the limitations of the data collection process. Consequently I see no harm in using non-random, realistically sampled real data, with a healthy dose of scepticism.

Statistics – Singular and Plural, Lies and Truth

Language is an issue in teaching and learning statistics. There are many words that have meanings in statistics, different from their everyday meaning, and even with multiple meanings within the study of statistics. Examples of troublesome words are: error, correlation, regression, significant, model. I wrote about addressing this in Teaching Statistical Language.

But the problem starts even with the name of the subject. There are at least three meanings for the term “statistics”. The word is not even consistently singular or plural. I suggest three meanings are: Data (plural), analysis (singular) and information (plural). What we teach focusses on the analysis, but involves data and information.

Statistics as Data

Sports people love statistics. Game shows and pub quizzes draw on data such as numbers of Olympic medals, wives, years of warfare, Oscars and a myriad other subjects. These statistics can be fascinating, relevant, boring or trivial. My most read blog post is entitled “Khan Academy Statistics videos are not good”. I suspect that quite a few people are searching for statistics about Khan Academy, rather than the subject of my post. This is borne out by the fact that a more recent post, “Open Letter to Khan Academy about Basic Probability”, gets considerably less traffic. I suppose there are not many people who want to know about the probability of Khan Academy. Pity – as the second post is better.

There is an entire discipline around “Official Statistics”. At a recent conference (ORSNZ/NZSA) I was fascinated by a presentation given about the need for statistics in a time of disaster and recovery. John Créquer talked about a subject close to my heart, the Christchurch earthquakes. In the weeks and months of the earthquakes authorities needed information on how many people there were of high need, in order to provide adequate service. Finding these numbers was an exercise in ingenuity and co-operation, drawing on data collected for other purposes. The presenter suggested that at times like that a national register would be invaluable. New Zealand does not have such a thing. It is an interesting conflict between the need for privacy and the public good. Créquer is a statistician from Statistics New Zealand, who has been contracted to CERA (The Canterbury Earthquake Recovery Authority) for now. I had never thought that a statistician had uniquely valuable skills and insights to be used in a time of recovery from disaster.

The internet is an amazing source of the data kind of statistics. You can find out the number of an awful lot of things, simply by putting the question in a search box, or looking on Wikipedia. (I’ve made my annual monetary contribution – have you?). Thanks to Wikipedia, we don’t need to wonder about trivial things anywhere near as much as we used to.

Statistics as Analysis

Statistics, as it is taught and learned as a subject, mostly refers to statistical analysis and the inquiry process in which it is embedded. I sometimes wonder what people are thinking when I say that I produce materials to help people learn statistics. Do they imagine a classful of students memorising the populations of countries and batting averages?

“It is easy to lie with statistics. It is hard to tell the truth without it.”

This quote is from Andrejs Dunkels, a person whom I wish I had met. When I was looking for the source of this quote, I found a tribute page to a man who contributed greatly to the world of statistics. His quote uses statistics as a singular noun.

The analysis aspect of statistics involves taking raw data and turning it into information and evidence of what may be truth. Science would not progress far without the tools of statistics to take the raw results of experiments and observations and, using the insights gained from the mathematical world of probability, discern their significance. Without the discoveries and tools of statistics we would not be able to make sensible inference about populations from samples and experiments.

Statistical analysis uses mathematical tools, but is far more than just the mathematics. It is easy to produce wrong information by using the mechanistic calculations without thinking critically about the results. I once produced some very wrong models of performance of bank branches, using multiple regression. I even came up with some interesting rationalisations for the counter-intuitive results. Then I did a residual plot and found one outlier that changed everything! Once I removed it, the models changed to the extent that some of the coefficients changed sign. I wonder how many wrong models persist because of well-intentioned, but unskilled analysts.
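The effect of a single outlier is easy to reproduce. The sketch below is a simplified stand-in for the bank branch story: it uses simple rather than multiple regression, and the data are invented so that one extreme point reverses the sign of the fitted slope.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for a simple linear regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data: ten points lying on the line y = 2x ...
xs = list(range(1, 11))
ys = [2 * x for x in xs]

# ... plus one extreme observation far from the rest.
xs_with_outlier = xs + [30]
ys_with_outlier = ys + [-60]

slope_clean, _ = fit_line(xs, ys)
slope_outlier, _ = fit_line(xs_with_outlier, ys_with_outlier)

print(f"slope without the outlier: {slope_clean:.2f}")
print(f"slope with the outlier:    {slope_outlier:.2f}")
```

One point out of eleven is enough to turn a clearly positive relationship into a negative one, which is why plotting the data and the residuals matters before interpreting any coefficients.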

There is a wonderful paragraph I used to quote in my second year statistical methods class, that unfortunately I can’t find – even using Wikipedia. It says, in essence: Statistical models are not sausage machines, taking in data and turning it into information without the interference of a human. If the results do not make sense and align with common understanding of the phenomenon, they are probably wrong.

If someone can direct me to the actual quote, I’d be very happy. I used to get the class to recite it in unison.

The point I am making is that the second meaning of statistics is a combination of science and art. It needs people.

Statistics as Information

This is similar to the first meaning, but I think that processed data should have a home separate from raw data. Statistical results include relationships and differences, not just “the facts.” I would put graphs and tables into this category. I think this category is scarier than statistics as data. Everyone can understand that Henry the Eighth had six wives, and New Zealand won six gold medals at the London Olympics. Those are non-scary statistics, and easily accessible. They are statistics as data or facts.

What is more daunting to many people is the results of analysis. This is where we try to explain the population effect of cancer screening, the significance (statistical) of an increase or decrease in birthrate, the effect of seasonality on the sales of jewellery in the USA, the evidence that increasing numbers of cows are causing a degradation of water quality in natural water sources. These statistics need to be well presented. Part of our role as teachers is to help future producers of such information to be able to express themselves well so these statistics are accessible. Another part of our role is to help future consumers of statistics to understand them.

Our role is important – for all three types of statistics.

Deterministic and Probabilistic models and thinking

The way we understand and make sense of variation in the world affects decisions we make.

Part of understanding variation is understanding the difference between deterministic and probabilistic (stochastic) models. The NZ curriculum specifies the following learning outcome: “Selects and uses appropriate methods to investigate probability situations including experiments, simulations, and theoretical probability, distinguishing between deterministic and probabilistic models.” This is at level 8 of the curriculum, the highest level of secondary schooling. Deterministic and probabilistic models are not familiar to all teachers of mathematics and statistics, so I’m writing about it today.


The term, model, is itself challenging. There are many ways to use the word, two of which are particularly relevant for this discussion. The first meaning is “mathematical model, as a decision-making tool”. This is the one I am familiar with from years of teaching Operations Research. The second way is “way of thinking or representing an idea”. Or something like that. It seems to come from psychology.

When teaching mathematical models in entry level operations research/management science we would spend some time clarifying what we mean by a model. I have written about this in the post, “All models are wrong.”

In a simple, concrete incarnation, a model is a representation of another object. A simple example is that of a model car or a Lego model of a house. There are aspects of the model that are the same as the original, such as the shape and ability to move or not. But many aspects of the real-life object are missing in the model. The car does not have an internal combustion engine, and the house has no soft-furnishings. (And very bumpy floors). There is little purpose for either of these models, except entertainment and the joy of creation or ownership. (You might be interested in the following video of the Lego Parisian restaurant, which I am coveting. Funny way to say Parisian!)

Many models perform useful functions. My husband works as a land-surveyor, and his work involves making models on paper or in the computer, of phenomena on the land, and making sure that specified marks on the model correspond to the marks placed in the ground. The purpose of the model relates to ownership and making sure the sewers run in the right direction. (As a result of several years of earthquakes in Christchurch, his models are less deterministic than they used to be, and unfortunately many of our sewers ended up running the wrong way.)

Our world is full of models:

  • a map is a model of a location, which can help us get from place to place.
  • sheet music is a written model of the sound which can make a song
  • a bus timetable is a model of where buses should appear
  • a company’s financial reports are a model of one aspect of the company

Deterministic models

A deterministic model assumes certainty in all aspects. Examples of deterministic models are timetables, pricing structures, a linear programming model, the economic order quantity model, maps, accounting.
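As a small illustration of what "certainty in all aspects" looks like, here is a sketch of the classic economic order quantity formula, Q* = sqrt(2DS/H), where D is annual demand, S is the cost of placing one order and H is the annual cost of holding one unit. The input figures are invented.

```python
import math

def economic_order_quantity(annual_demand, order_cost, holding_cost):
    """Classic EOQ: the order size that minimises ordering plus holding cost."""
    return math.sqrt(2 * annual_demand * order_cost / holding_cost)

# Hypothetical inputs: 1000 units a year, $50 per order, $4 per unit per year to hold.
q = economic_order_quantity(annual_demand=1000, order_cost=50, holding_cost=4)
print(round(q, 1))
```

The same inputs always give the same answer. The model has no way of saying that demand next year is uncertain, which is exactly what makes it deterministic.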

Probabilistic or stochastic models

Most models really should be stochastic or probabilistic rather than deterministic, but this is often too complicated to implement. Representing uncertainty is fraught. Some more common stochastic models are queueing models, Markov chains, and most simulations.

For example when planning a school formal, there are some elements of the model that are deterministic and some that are probabilistic. The cost to hire the venue is deterministic, but the number of students who will come is probabilistic. A GPS unit uses a deterministic model to decide on the most suitable route and gives a predicted arrival time. However we know that the actual arrival time is contingent upon all sorts of aspects including road, driver, traffic and weather conditions.
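The school formal can be turned into a small simulation that mixes the two kinds of element. Everything numerical here is an assumption made for illustration: the venue cost, the ticket price, the number of students invited, and the idea that each student decides independently with a 70% chance of attending.

```python
import random

rng = random.Random(42)   # fixed seed so the run is repeatable

VENUE_COST = 4000         # deterministic: agreed in advance (hypothetical figure)
TICKET_PRICE = 40         # deterministic (hypothetical figure)
INVITED = 150             # hypothetical number of students invited
P_ATTEND = 0.7            # assumed chance that any one student attends

def simulate_profit():
    """One possible evening: attendance is random, the costs are fixed."""
    attendees = sum(rng.random() < P_ATTEND for _ in range(INVITED))
    return attendees * TICKET_PRICE - VENUE_COST

profits = [simulate_profit() for _ in range(2000)]

# The deterministic view plugs in the expected attendance and stops there.
deterministic_profit = INVITED * P_ATTEND * TICKET_PRICE - VENUE_COST
mean_profit = sum(profits) / len(profits)
loss_share = sum(p < 0 for p in profits) / len(profits)

print(f"deterministic profit: {deterministic_profit:.0f}")
print(f"average simulated profit: {mean_profit:.0f}")
print(f"share of simulated formals that lose money: {loss_share:.2f}")
```

The deterministic calculation reports a single comfortable profit, while the simulation shows that a noticeable share of possible evenings end in a loss.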

Model as a way of thinking about something

The term “model” is also used to describe the way that people make sense out of their world. Some people have a more deterministic world model than others, contributed to by age, culture, religion, life experience and education. People ascribe meaning to anything from star patterns, tea leaves and moon phases to ease in finding a parking spot and not being in a certain place when a coconut falls. This is a way of turning a probabilistic world into a more deterministic and more meaningful world. Some people are happy with a probabilistic world, where things really do have a high degree of randomness. But often we are less happy when the randomness goes against us. (I find it interesting that farmers hit with bad fortune such as a snowfall or drought are happy to ask for government help, yet when there is a bumper crop, I don’t see them offering to give back some of their windfall voluntarily.)

Let us say the All Blacks win a rugby game against Australia. There are several ways we can draw meaning from this. If we are of a deterministic frame of mind, we might say that the All Blacks won because they are the best rugby team in the world. We have assigned cause and effect to the outcome. Or we could take a more probabilistic view of it, deciding that the probability that they would win was about 70%, and that on the day they were fortunate. Or, if we were Australian, we might say that the Australian team was far better and it was just a 1 in 100 chance that the All Blacks would win.

I developed the following scenarios for discussion in a classroom. The students can put them in order or categories according to their own criteria. After discussing their results, we could then talk about a deterministic and a probabilistic meaning for each of the scenarios.

  1. The All Blacks won the Rugby World Cup.
  2. Eri did better on a test after getting tuition.
  3. Holly was diagnosed with cancer, had a religious experience and the cancer was gone.
  4. A pet was given a homeopathic remedy and got better.
  5. Bill won $20 million in Lotto.
  6. You got five out of five right in a true/false quiz.
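For the last scenario the probabilistic reading can be worked out exactly. The sketch below assumes pure guessing, with each answer independently right half the time, and a hypothetical class of 30 students.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Probability of at least k correct out of n independent guesses."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

# Chance that one student guesses all five answers correctly: (1/2)^5 = 1/32.
p_all_five = prob_at_least(5, 5)

# Chance that at least one student in a hypothetical class of 30 does so.
CLASS_SIZE = 30
p_someone = 1 - (1 - p_all_five) ** CLASS_SIZE

print(f"one student guessing five out of five: {p_all_five:.5f}")
print(f"at least one of {CLASS_SIZE} students doing so: {p_someone:.2f}")
```

A perfect score is unlikely for any one guesser, but in a whole class of guessers it is more likely than not that somebody gets one, which makes a useful starting point for the discussion.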

The regular mathematics teacher is now a long way from his or her comfort zone. The numbers have gone, along with the red tick, and there are no correct answers. This is an important aspect of understanding probability – that many things are the result of randomness. But with this idea we are pulling mathematics teachers into unfamiliar territory. Social studies, science and English teachers have had to deal with the murky area of feelings, values and ethics forever. In terms of preparing students for a random world, I think it is territory worth spending some time in. And it might just help them find mathematics/statistics relevant!

Guest Post: Risk, Insurance and the Actuary

Risk is an inherent part of our daily life. As a result, most of us take out insurance policies as a means of protection against scenarios which, were they to occur, may cause hardship whether for us or, as in the case of life insurance, for our families.

Insurance companies write many types of policies. The mutual risks of the policy holders are shared so that claims made against the policies can be covered at a much reduced cost. If priced fairly, then the premium reflects the contribution of the insured’s risk to overall risk.

As policy holders, we want the best price to cover the risk we are offloading; as shareholders of the insurance company (again, often us, if we have superannuation), we require the premiums to be sufficient to ensure the company stays in business.

It is then very important that analysts pricing the policies (and those calculating the required level of capital to meet the claim liabilities) have the statistical knowledge necessary to measure risk accurately! Understanding risk is even more critical in the framework of Solvency II (*) capital requirements (if it ever gets enforced).

The task is made more difficult as the duration of the policy life varies considerably. Some insurance cover is claimed against shortly after the incident occurs with a short processing time – automobile accidents for instance typically fit this category. This class of cover is termed short-tail liabilities as payments are completed within a short timeframe of the incident occurring.

Other cases arise many years after the original policy was taken out, or payments may occur many years after the original claim was raised – for example medical malpractice. These are termed long-tail liabilities as payments may be made long after the original policy was activated or the incident occurred. Due to the long forecast horizon and [generally] higher volatility in the claim amounts, long-tail liabilities are inherently more risky.

Life insurance is in its own category as everybody dies sometime.

Meet the data

For convenience, and because it is generally less well understood, we restrict our focus to long-tail liability insurance data.

For each claim we have many attributes, but four that are universal to all claims: payment amount(s), incident date (when the originating event resulting in the claim occurred), payment date(s), and state of claim (are further payments possible or is the claim settled). These attributes allow the aggregation of the individual claim data into a series more amenable for analysis at the financial statement level where the volatility of individual claims should be largely eliminated since the risk is pooled.

Actuaries tend to present their data cumulatively in a table like this:

[Image: Actuarial table]

Here the rows are accident years, and the column index (development time in actuarial parlance) is the delay between the accident year and the year of payment.

Thus payments made in development lag 0 correspond to all payments made toward claims in the year the accident occurred. The values in development lag 10 correspond to the sum of the payments made in the eleven years since the accident occurred.
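The layout described above can be sketched with a toy example. The payment figures and the years are invented, and the helper `calendar_year` is a hypothetical name; the only rule it encodes is that the year of payment is the accident year plus the development lag.

```python
from itertools import accumulate

# Hypothetical incremental payments: one row per accident year,
# one entry per development lag (0, 1, 2, ...). Later years have fewer lags observed.
incremental = {
    2010: [100, 60, 30, 10],
    2011: [110, 70, 35],
    2012: [120, 65],
    2013: [130],
}

# The traditional presentation: running totals along each row.
cumulative = {year: list(accumulate(payments)) for year, payments in incremental.items()}

def calendar_year(accident_year, lag):
    """Year in which a payment at the given development lag was made."""
    return accident_year + lag

for year, row in cumulative.items():
    print(year, row)

# Cells on the same diagonal share a calendar (payment) year.
latest_diagonal = [
    (year, lag)
    for year, payments in incremental.items()
    for lag in range(len(payments))
    if calendar_year(year, lag) == 2013
]
print(latest_diagonal)
```

Each diagonal of the triangle collects the payments made in one calendar year, which is why calendar year effects show up as trends down the diagonals.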

This presentation likely arose for a number of reasons, the two most important being:

  • Cumulative data are much easier to work with in the absence of computers;
  • Volatility is visibly less of an issue the further in the development tail when examining cumulatives.

The nature of the inherited data presentation produces some unfortunate consequences:

  • Variability is hard to quantify between parameter uncertainty and process volatility;
  • Calendar year effects (trends down the diagonals) cannot be measured, and therefore cannot be readily predicted;
  • Parameter interpretation is difficult due to the calendar year confounding effects; and
  • Parsimony is hard to achieve.

The actuarial profession attempts to deal with each of these issues in various ways. For instance, the bootstrap is being used to quantify variability. Data may be indexed against inflation to partially account for calendar year trends.

Why spend time on this?

Fundamentally because, if you want to solve a problem, you first have to be sure that the data you are using and the way you are using it allows you to solve the problem! The profession has spent much time, energy, and analysis on developing techniques to solve the risk measurement problem but with the underlying assumption that cumulation is the way to analyse insurance data.

Aside: this is why I enjoy Genetic Programming – not because the algorithm allows the automatic generation of solutions, but rather because you have to formulate the problem very precisely in order to ensure the right problem is solved.

Understanding the problem

The objective of analysis of the insurance portfolios is to quantify the expected losses incurred by the insurance company and the volatility (the risk) associated with the portfolio so adequate money is raised to pay all liabilities, at a reasonable price, with an excellent profit. Additional benefits may arise like an improved understanding of the policies being written, targeting of more profitable customers, and so forth, but these are secondary.

Assume the data available are the loss data with the three attributes of accident time, calendar time, and payment. Forget about claim state for now though this is an important factor for future projections.

We immediately identify two time attributes. This suggests time series models are likely a good starting point for analysis. We also would examine the distribution(s) of incremental losses rather than cumulate the losses over time since cumulation of time series would hide the volatility of the losses at the individual time points – the very component that we are interested in.
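A tiny numerical sketch shows how cumulation hides movement in the individual periods. The payment figures are invented: the second series is the same as the first except that the final incremental payment is 50% higher.

```python
from itertools import accumulate

# Hypothetical incremental payments by development lag for one accident year.
baseline = [100, 60, 30, 10]
shocked = [100, 60, 30, 15]   # last payment is 50% higher than in the baseline

def pct_change(old, new):
    """Percentage change from old to new."""
    return 100 * (new - old) / old

# The change as seen in the incremental data at the last lag.
inc_change = pct_change(baseline[-1], shocked[-1])

# The same change as seen in the cumulative data at the last lag.
cum_baseline = list(accumulate(baseline))
cum_shocked = list(accumulate(shocked))
cum_change = pct_change(cum_baseline[-1], cum_shocked[-1])

print(f"change in the incremental payment: {inc_change:.1f}%")
print(f"change in the cumulative total:    {cum_change:.1f}%")
```

A swing that is obvious in the incremental payments almost disappears in the cumulative totals, because by then the running total is dominated by what was paid in earlier periods.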

Further, we need the ability to distinguish between parameters, parameter uncertainty, and the process volatility. Process volatility and parameter uncertainty drive the critical risk metrics which are essential to ensuring adequate capital is set aside to not only cover the expected losses, but also allow for the unexpected losses should they occur.

Beginning with this foundation, modelling techniques which take the fundamental time-series nature of the data into account are almost certain to provide superior performance to methodologies which mask (for historical reasons mentioned) the time series nature of the data.

Is this new?

Actually, no. All the above considerations of analysis of P&C insurance data were presented many years ago. However, time series approaches are not typically taught to aspiring P&C actuaries. Why?

Perhaps several reasons:

  • Tradition. Like any specialised profession, a system is developed to provide solutions and unless the system is convincingly broken, the uptake of new methodology is resisted.
  • Statistical analysis is complicated. Applying a standard formula to get answers is “easy” when you know the formula.

The catch

Misrepresenting data leads to a flawed model representing the underlying data processes.

The likelihood of such a methodology resulting in the correct mean or a correct measure of the volatility is extremely low. The distributional assumptions are likely completely spurious as the fundamental nature of the data is not recognised.

Wrong model = wrong conclusion, unless you’re lucky

It is often a general problem where the wrong statistical technique is applied to solve a statistical problem. This suggests the statement: “All models are wrong, but some are useful.” This is not entirely fair in my mind as it (wrongly) places the blame on the model where the blame should actually be on the analyst and their choice of the modelling method.

Although we will never find the model driving the underlying data generating process, nevertheless, we can often well approximate the data process (otherwise modelling of any kind would be pointless). These are the useful models. Then you are only unlucky if your model looks like it is useful, but fails when it comes to prediction.

In summary

  • The problem of quantifying risk is not a simple exercise
  • Insurance data is fundamentally financial time series data
  • The right starting point is critical to any statistical analysis
  • We statisticians need to explain our solutions in a way that is meaningful to established professions

(*) In essence, Solvency II comprises insurance legislation aiming to improve policyholder protection by introducing a clear, comprehensive framework for a market consistent, risk model. In particular, insurance companies must be able to withstand a 1/200 year loss event in the next calendar year encompassing all levels of risk sources – insurance and reserve risk, catastrophe risk, operational risk, default risk to name a few.  Quantitative impact study documents are available here; a general discussion of Solvency II can be found here. The legislation has been postponed many times.

About David Munroe

David Munroe leads Insureware’s outstanding statistical department. Comments in this article are the author’s own and do not necessarily represent the position of Insureware Pty Ltd.

He completed a Masters degree in Statistics (with First Class Honours) from Massey University, New Zealand.

David has experience in statistical and actuarial analysis along with C++ programming knowledge. Previous projects include working with a Canadian insurance company on software training and implementation, resulting in significant modelling improvements (regions can be modelled within a working day, allowing analysts to focus on providing extracted insights to management).

David studied the art of Shaolin Kempo for over nine years, holds a second degree black belt, and is qualified in the use of Okinawan weaponry. He is also interested in music (piano), literature, photography, and self sufficiency. He also has two children on the autism spectrum.

Analysis of “Deal or No Deal” results

Deal or No Deal

My son, Jonathan, loves game-shows, and his current favourite is Deal or No Deal, the Australian version. It has been airing now for over ten years, and there is at least one episode available every weeknight on New Zealand television. I often watch it with him as it is a nice time to spend together. We discuss whether people should take the deal or not, and guess what the bank offer will be. There are other followers of the programme, equally devoted, and I am grateful to Paul Corfiatis and his mum who fastidiously collected data for all the 215 programmes in 2009 on the final takings, the case chosen and the case containing the $200,000. In this post I analyse this data, and give some ideas of how this can be used in teaching.

Deal or No Deal, explained

You can find out ALL about Deal or No Deal on Wikipedia. I was excited to see our New Zealand radio gameshow, “The Money or the Bag” given as an antecedent.  There are numerous incarnations of the game. The basic idea is that there are 26 cases, containing a range of money values from 50c to $200,000. The money values are randomly assigned and their allocation unknown to the contestant and the “banker”. The contestant chooses one of the cases, and chats to the host, Andrew O’Keefe, about what they will do with the money when they win. The usual responses are to have a big wedding or travel. As the programme is filmed in Melbourne, often second generation Australians are wanting to visit their parents’ homeland.   Usually the contestant has a friend or family member as a podium player, who interacts as part of the banter. In the first round, the player chooses six cases to open, thus gaining information about the possible value in their case. At the end of the round, the banker offers a sum of money to buy back the case from the contestant, who must choose, “Deal” (take the money) or “No Deal”, keep the case and its contents. In the second round five cases are opened and then there is another bank offer. This continues until the sixth round, and from then the cases are opened one at a time, with an offer made after each one. The player either takes the deal at some point, or holds out until the end, at which point they take the contents of the case. There are other variants on this basic game, to add variety.

Human aspects

My son is blind and has autism, and finds much to like about this programme. He likes the order of it all – every night, a very similar drama is played out, and he can understand exactly what is happening. He also likes the agony and the joy. He gets very excited when the case containing $200,000 is opened with the special sound effect, and Andrew says, “Oh No”. He likes hearing about the people, and their lives and he likes that you never know how much you might win.

I also like the drama and the joy, but I’d rather not watch when it is going badly. I like it because it is an insight into people’s perceptions of chance. Like many people, I yell at the screen, telling them to take the deal when we see them being reckless, but I am usually happy when their foolish decisions turn out well.  To me it is a true reality show – not because the situation is in any way like reality, but because the people are authentic in their responses. I have been known to weep when a nice person wins a sizeable amount of money. One day I would love to go on the show, as I know how much joy that would bring Jonathan, to be a part of it.

Part of the appeal is the collective experience of it all. The podium players, the audience and the people at home feel connected to the main contestant. One episode that Jonathan loves to tell people about is with Josh Sharpe who was REALLY unlucky. You can see that here on YouTube:

The probability

The probability calculation for Deal or No Deal is very simple. The contestant has one chance in twenty-six that their case contains the big prize. They have four chances in twenty-six that their case contains a prize of $50,000 or more. The expected value of their prize, if they hold onto their case to the end, is about $19,900 (valuing the car at $30,000). When the banker makes an offer, it is often around the expected value of the remaining unopened cases. (The average amount left.) There are times when the offer is considerably lower or higher than the expected value, which seems to be in an effort to push the contestant one way or the other. Contestants very seldom take the deal in the early rounds of the game.
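The comparison between a bank offer and “the average amount left” can be sketched in a few lines of Python. The remaining prizes and the offer below are a hypothetical late-game position, not taken from any episode, and `expected_value` and `offer_ratio` are names invented for this illustration.

```python
def expected_value(remaining_prizes):
    """Average of the prizes still unopened (each is equally likely)."""
    return sum(remaining_prizes) / len(remaining_prizes)


def offer_ratio(bank_offer, remaining_prizes):
    """Bank offer as a fraction of the expected value of what is left."""
    return bank_offer / expected_value(remaining_prizes)


# Hypothetical position: four prizes left, including the contestant's own case.
remaining = [0.50, 500, 50_000, 200_000]

print(expected_value(remaining))       # the average amount left
print(offer_ratio(40_000, remaining))  # below 1 means the offer is under the average
```

A ratio well below or above 1 is the kind of offer that appears designed to nudge the contestant one way or the other.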

There are a number of interesting questions we can explore:

  • What is the distribution of the actual outcomes for contestants?
  • How often do contestants do better than what is in their case?
  • Are there any “lucky” cases that contain the big prize more often than others?

To explore these questions I am using the data so diligently collected by Paul Corfiatis. I will use data from games with the regular list of prizes, not “Fantastic Four”, which has some more high value cases.

What is the actual outcome for the contestants?

The following graph shows the amount of money the contestants win, either by taking the deal or hanging out for the case.


You could have an interesting discussion about the factors to account for in looking at this. You would expect the mean to be lower for the “case” prizes, as they tend to be people who have kept going to the bitter end. There is a very large standard deviation.

Here is a table of results:

                      Case       Deal        Either case or deal
Number of instances   53         146         199
Mean                  $6139      $21,044     $17,075
Median                $500       $18,350     $15,000
Standard Deviation    $13,740    $14,499     $15,721
Minimum               $0.50      $950        $0.50
Maximum               $50,000    $100,000    $100,000
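As a quick consistency check that students could do themselves, the overall mean should be the count-weighted average of the two group means. This is a minimal sketch using only the counts and means from the table; `pooled_mean` is a hypothetical helper name.

```python
def pooled_mean(groups):
    """Combine (count, mean) pairs into the mean of the whole sample."""
    total_count = sum(count for count, _ in groups)
    total_sum = sum(count * mean for count, mean in groups)
    return total_sum / total_count


# (number of instances, mean winnings) for "case" and "deal" contestants.
groups = [(53, 6139), (146, 21044)]

print(round(pooled_mean(groups)))
```

The result lands within a dollar of the $17,075 in the table; the small gap presumably comes from the group means having been rounded.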

How often do contestants do better than what is in their case?

For this I calculated the prize less the amount that was in their case. The mean value was $1082, with a median of $9969.50, minimum of -$170,050 and a maximum of $99,995. Contestants who took the deal, did better, 106 times out of 146, or 73% of the time.
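The calculation described here is a subtraction followed by a count. The sketch below assumes each game is recorded as a (won, in_case) pair; the four records are hypothetical, and only the 106 out of 146 figure comes from the data.

```python
def gains_over_case(records):
    """Prize actually won minus the amount that was in the contestant's case."""
    return [won - in_case for won, in_case in records]


def proportion_better(records):
    """Share of contestants whose winnings beat the contents of their case."""
    gains = gains_over_case(records)
    return sum(1 for gain in gains if gain > 0) / len(gains)


# Hypothetical (won, in_case) pairs for contestants who took the deal.
deals = [(18_350, 500), (21_000, 50_000), (950, 0.50), (30_000, 200_000)]

print(gains_over_case(deals))
print(proportion_better(deals))

# The figure reported in the post: 106 of 146 deal-takers did better.
print(round(106 / 146 * 100))
```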

Lucky Cases

And of course the one to make the statisticians smile – are there any lucky cases?

Here is a graph of the distribution of cases that held the $200,000. I am tempted to make glib comments about how clearly 14 is a lucky case, so you should pick that one, but then, maybe you should pick 19, as it hasn’t had the $200,000 much. But as you never know who is going to quote you, I’d better not.

Which case contained the $200,000 in 2007.
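For a class that wants to go beyond eyeballing the graph, one option is to ask how uneven the counts would look if every case really were equally likely. The sketch below uses a chi-square statistic with a simulated p-value, standard library only. The counts are invented for illustration (they are not the counts in the graph), and the function names are hypothetical.

```python
import random


def chi_square_uniform(counts):
    """Chi-square statistic against 'every case is equally likely'."""
    expected = sum(counts) / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)


def simulated_p_value(counts, trials=2000, seed=1):
    """Share of simulated fair seasons at least as uneven as the observed one."""
    rng = random.Random(seed)
    n_cases, n_games = len(counts), sum(counts)
    observed_stat = chi_square_uniform(counts)
    at_least_as_extreme = 0
    for _ in range(trials):
        simulated = [0] * n_cases
        for _ in range(n_games):
            simulated[rng.randrange(n_cases)] += 1
        if chi_square_uniform(simulated) >= observed_stat:
            at_least_as_extreme += 1
    return at_least_as_extreme / trials


# Invented counts for 26 cases: mostly 8, one "lucky" case and one "unlucky" one.
counts = [8] * 26
counts[13] = 14  # case 14
counts[18] = 2   # case 19

print(chi_square_uniform(counts))
print(simulated_p_value(counts))
```

A large simulated p-value would mean that a fair allocation often produces a graph at least this lumpy, which is the point to draw out with students.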

Educational use for this

Depending on how much you wish to torment your students, and the educational objectives, you could give them the raw data, as provided on the site, and see what they come up with.  Or you could simply present the results given in this post, watch an episode, and discuss what meanings people could take from the data, and what misconceptions might occur.

About blogging

This is the 100th post on “Learn and Teach Statistics and Operations Research”. To celebrate, I am writing about the joys of blogging.

Anyone with an internet connection can blog these days, and many do! It is the procrastinator’s “dark playground” to read blogs on pretty much anything you want to know. (For an explanation, with pictures, of the dark playground, where the instant gratification monkey holds sway until the panic monster arrives, see this entertaining post: Why Procrastinators Procrastinate.)

I started to blog to build a reputation for knowing about teaching statistics and operations research. This would lead people to buy our apps, subscribe to our on-line materials and watch my YouTube videos. Many blogs are set up, like this, in order to build credibility and presence on the internet. I’ve found it quite exciting to watch the readership grow, and I particularly love it when people comment. I also like to feel that I am doing some good in the world. The process of writing is also a learning process for me.

Here are some lightly structured thoughts about what I’ve learned over the last 99 posts.

A blog is not a scholarly research paper

As I come from an academic background, I have had to remind myself that a blog is different from a scholarly research paper. A blog isn’t scholarly, it isn’t based on research (unless you can call time in the shower that) and it isn’t on paper.

Blogging rewards bad behaviour.

The more opinionated you are, and the less evidence you use to support your argument, the more readers you get.  You must remove equivocation. Often after I write my first draft, I go through and remove statements like “in my opinion” or  “it seems”.  This is the antithesis of a scholarly paper, which must be carefully stated in balanced and measured tones.

Blogs are personal

It is good to be personal in a blog. In journal articles we avoid the use of first person language as if the paper were somehow written by itself. This can give rise to convoluted sentence structures and endless passive voice. When I write my blog, I talk about my own ideas, and even aspects of my life. I mention side tracks, and give a little bit of myself. And I prefer to read blogs that have a bit of the author in them. I think you need a little touch of narcissism to enjoy blogging.

Quantity is more important than quality

Volume in blogging dominates quality. Some might argue that this is also true for academic papers. In a blog you are better to dash off one opinion piece a week, than put the same effort into one scholarly paper. If one falls flat, it really doesn’t matter.

Blogs give instant gratification

Blogs have a quick turn-around, ideal for people with short attention spans who want instant gratification. In academia the delay between doing the research and seeing it in print is measured in years. By the time an article has been through the review process, you have almost forgotten why you did the research in the first place. And don’t really care anymore. But when you blog and click “Publish”, it is out there in the world for all to see.

People read blogs

People read your blog. It is an amazing feeling to send out my thoughts into the world and watch the viewing stats on WordPress, knowing that hundreds and sometimes even thousands of people are reading my opinion, literally all over the world. And sometimes I even get emails from fans, telling how my post has helped them or inspired them to work or do research in the area of statistics education. Or else I find that an educational institution has set a link to one of my posts for their students to read. In contrast I wonder if anyone has ever read my journal articles, apart from the reviewers. Not only do people read your blog, but you can see where they live and what they read, and even what search engine terms brought them to the blog. Some search terms boggle the mind, first that someone entered them, and secondly that they led to my blog! The term “rocks” has led to my site 66 times in the last two years, which I am sure was disappointing for the searcher. The most common search term is “causation”.

Blogging does not get you promoted

Though blogging is fun and great for attention-seekers, it does not improve your PBRF ratings (in NZ) or whatever the measure of publication activity is in a specific country. Nor does blogging count for promotion or tenure. This may be simply a matter of time to allow attitudes to change, as erudite blogs can get scientific findings out into the public domain far more rapidly than the old print-based system.

People can be mean

A blogger needs to have a thick skin. I don’t yet, and have to remind myself that I didn’t research my article, so it is only fair for people to offer opposing views. In fact, one of the great qualities of a blog is that anyone can respond and improve the quality of the blog. I love it when people leave comments; it is the emailed “hate-messages” that are a bit upsetting.

Keynote speaker

One spin off of a successful blog is that you get asked to be a keynote speaker.

Actually I’m kidding on that one. I’d love to be a keynote speaker, and I’m pretty sure I could entertain a crowd and give them something to think about for an hour or so, but it hasn’t happened. Yet. Any invitations?

Proving causation

Aeroplanes cause hot weather

In Christchurch we have a weather phenomenon known as the “Nor-wester”, which is a warm dry wind, preceding a cold southerly change. When the wind is from this direction, aeroplanes make their approach to the airport over the city. Our university is close to the airport in the direct flightpath, so we are very aware of the planes. A new colleague from South Africa drew the amusing conclusion that the unusual heat of the day was caused by all the planes flying overhead.

Statistics experts and educators spend a lot of time refuting claims of causation. “Correlation does not imply causation” has become a catch cry of people trying to avoid the common trap. This is a great advance in understanding that even journalists (notoriously math-phobic) seem to have caught onto. My own video on important statistical concepts ends with the causation issue. (You can jump to it at 3:51)

So we are aware that it is not easy to prove causation.

In order to prove causation we need a randomised experiment. We need to make random any possible factor that could be associated, and thus cause or contribute to the effect.
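The random allocation step can be shown to students in a few lines. This is a minimal Python sketch, assuming a simple two-group design with equal group sizes; the subject labels and the `randomly_assign` helper are hypothetical.

```python
import random


def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into treatment and control groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


students = [f"student_{i}" for i in range(1, 21)]
treatment, control = randomly_assign(students, seed=42)

print(treatment)
print(control)
```

Because chance alone decides who lands in which group, any other factor (keen students, early risers, plane-spotters) should be spread roughly evenly between the two.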

There is also the related problem of generalizability. If we do have a randomised experiment, we can prove causation. But unless the sample is also a random representative sample of the population in question, we cannot infer that the results will also transfer to the population in question. This is nicely illustrated in this matrix from The Statistical Sleuth by Fred L. Ramsey and Daniel W Schafer.

The relationship between the type of sample and study and the conclusions that may be drawn.

The top left-hand quadrant is the one in which we can draw causal inferences for the population.

Causal claims from observational studies

A student posed this question:  Is it possible to prove a causal link based on an observational study alone?

It would be very useful if we could. It is not always possible to use a randomised trial, particularly when people are involved. Before we became more aware of human rights, experiments were performed on unsuspecting human lab rats. A classic example is the Vipeholm experiments where patients at a mental hospital were the unknowing subjects. They were given large quantities of sweets in order to determine whether sugar caused cavities in teeth. This happened into the early 1950s. These days it would not be acceptable to randomly assign people to groups who are made to smoke or drink alcohol or consume large quantities of fat-laden pastries. We have to let people make those lifestyle choices for themselves. And observe. Hence observational studies!

There is a call for “evidence-based practice” in education to follow the philosophy in medicine. But getting educational experiments through ethics committee approval is very challenging, and it is difficult to use rats or fruit-flies to impersonate the higher learning processes of humans. The changing landscape of the human environment makes it even more difficult to perform educational experiments.

To find out the criteria for justifying causal claims in an observational study I turned to one of my favourite statistics text-books, Chance Encounters by Wild and Seber  (page 27). They cite the Surgeon General of the United States. The criteria for the establishment of a cause and effect relationship in an epidemiological study are the following:

  1. Strong relationship: For example illness is four times as likely among people exposed to a possible cause as it is for those who are not exposed.
  2. Strong research design
  3. Temporal relationship: The cause must precede the effect.
  4. Dose-response relationship: Higher exposure leads to a higher proportion of people affected.
  5. Reversible association: Removal of the cause reduces the incidence of the effect.
  6. Consistency: Multiple studies in different locations producing similar effects
  7. Biological plausibility: there is a supportable biological mechanism
  8. Coherence with known facts.

Teaching about causation

In high school, and entry-level statistics courses, the focus is often on statistical literacy. This concept of causation is pivotal to correct understanding of what statistics can and cannot claim. It is worth spending some time in the classroom discussing what would constitute reasonable proof and what would not. In particular it is worthwhile to come up with alternative explanations for common fallacies, or even truths in causation. Some examples for discussion might be drink-driving and accidents, smoking and cancer, gender and success in any number of areas, home game advantage in sport, the use of lucky charms, socks and undies. This also ties nicely with probability theory, helping to tie the year’s curriculum together.

Absolute and Relative Risk

It is important that citizens can make sense out of the often outrageous claims of advertisers and pro-screening advocates.  It isn’t what they say, but how they say it. What looks like a very large and scary increase in risk, can in fact make very little practical difference. Conversely a large risk can be made to look smaller through the manner in which it is communicated.

I found a wonderful set of notes on the Census at School site, presented as a powerpoint file.

I also found several very interesting and educational sites about risk.

This first one explains about risk and relative risk: Science blog on Cancer Research UK

This one also includes Number needed to treat. Patient Health UK.

And here is a great summary and set of exercises at the Auckland Maths Association website. You need to scroll down to “Relative Risk Resources”. (I found this after writing the rest of the blog, and it pretty much says what I say, but more succinctly!)

Teaching about Risk

Risk is a great topic for teaching about probability, percentages and perception.

It’s what’s on the bottom that counts!

In exploring risk, there are several distinct processes needed. Depending on the format in which the information is given, students may need to construct their own frequency table, or interpret the one provided. From the frequency table they must calculate the probability, making sure that they choose the correct denominator. Then if they are looking for relative risk, they need to make sure that they again choose the correct denominator. For some reason, the numerator is usually easier. But what can be tricky is the denominator.

We can use as an example the increase in probability of passing a particular statistics course if students use our Statistics Learning Centre materials to help them. We haven’t collected any data yet, so these figures are aspirational (as in a work of fiction!). Because we are talking about risk, we have to frame the outcome in negative terms. We would not talk about the risk of passing a course, but rather of failing one. So we will say that students who use StatsLC materials reduce their risk of failing by 66.7%. That is pretty impressive, but how much better it sounds if we frame it in terms of how much their risk will increase if they decide not to use the wonderful materials from StatsLC. Their risk of failure increases by 200%. That sounds pretty drastic.

But what we have failed to mention is the absolute risk, which is the proportion of students who fail their stats courses with and without the help of StatsLC. Here are some pairs of absolute risks that will give the results given:

All of the following sets of numbers show a 200% increase in risk of failure for students who do not use StatsLC materials.

Scenario   Risk of failing, when using StatsLC materials   Risk of failing when they don’t use StatsLC materials   Actual increase in risk of failing
A          1%                                              3%                                                      2 percentage points
B          10%                                             30%                                                     20 percentage points
C          20%                                             60%                                                     40 percentage points

In Scenario A, the pass-rate for the statistics course has gone from 97% to 99%. In Scenario B, the pass-rate has gone from 70% to 90%, and in Scenario C, the pass-rate has gone from 40% to 80%. All of these scenarios could accurately be described by the same change in relative risk. They all triple the risk of failing if the student does not use StatsLC.
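The three scenarios can be checked with a short Python sketch. The failure risks are simply 100% minus the pass-rates quoted above, kept as whole percentages; the function names are invented for this illustration.

```python
def relative_increase(risk_with, risk_without):
    """Percentage increase in risk for non-users, relative to users."""
    return (risk_without - risk_with) / risk_with * 100


def absolute_increase(risk_with, risk_without):
    """Increase in risk for non-users, in percentage points."""
    return risk_without - risk_with


# (risk of failing with StatsLC, risk of failing without), in percent.
scenarios = {"A": (1, 3), "B": (10, 30), "C": (20, 60)}

for name, (risk_with, risk_without) in scenarios.items():
    print(
        name,
        relative_increase(risk_with, risk_without),
        absolute_increase(risk_with, risk_without),
    )
```

The relative figure is 200% every time, while the absolute difference ranges from 2 to 40 percentage points, which is exactly the information the headline leaves out.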

This is usually the end of the story, based on what is reported. But if we wish to find out what is really going on, the best idea is to build a table of natural frequencies. These are great for calculating conditional probabilities by stealth.

Here is a table of natural frequencies for Scenario C above, using 1000 as our total number of people. Before we fill it out, we also need to know how many people used Statistics Learning Centre materials. 30% of students did NOT use StatsLC materials.

                     Pass                Fail                Total in category
Use StatsLC          80% of 700 = 560    20% of 700 = 140    700
Do not use StatsLC   40% of 300 = 120    60% of 300 = 180    300
Total pass or fail   680                 320                 1000

From this table, all manner of statistics can be computed.

What proportion of students who passed, used the StatsLC materials?

The answer is (the number of people who passed AND used StatsLC materials)/( the number of people who passed) = 560/680 =82%. It is important to find the correct denominator.
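Building the natural frequency table and reading a conditional proportion off it can also be done in code. This is a minimal sketch using the Scenario C figures (1000 students, 70% users, pass-rates of 80% and 40%); `natural_frequencies` and the dictionary keys are hypothetical names.

```python
def natural_frequencies(total, pct_users, pct_pass_users, pct_pass_non_users):
    """Whole-number counts for each combination of usage and outcome."""
    users = total * pct_users // 100
    non_users = total - users
    users_pass = users * pct_pass_users // 100
    non_users_pass = non_users * pct_pass_non_users // 100
    return {
        ("use", "pass"): users_pass,
        ("use", "fail"): users - users_pass,
        ("not_use", "pass"): non_users_pass,
        ("not_use", "fail"): non_users - non_users_pass,
    }


table = natural_frequencies(1000, 70, 80, 40)

# Denominator: everyone who passed, whether or not they used the materials.
passed = table[("use", "pass")] + table[("not_use", "pass")]
share_of_passers_who_used = table[("use", "pass")] / passed

print(table)
print(passed, round(share_of_passers_who_used * 100))
```

Changing which cells go into the denominator answers a different question, so it is worth having students say the question out loud before they divide.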

Then when people calculate relative risk, it is important to be careful about choosing the baseline.

Another question might be, by how much does your risk of failure decrease, in relative terms, if you use the StatsLC materials?

The first step is to find the decrease in absolute terms. The risk of failure, not using StatsLC = 0.6. The risk of failure when using StatsLC has decreased to 0.2. That is an absolute decrease in risk of 0.4. Then we need to express this relative to the baseline. As we talked about the decrease in risk, it will be compared with the larger number, or 0.6, the risk of failing when not using the StatsLC materials. So 0.4/0.6 = 0.667 or 66.7%. However, if we were talking about the increase in risk for NOT using StatsLC materials, then we would find 0.4/0.2 = 200%.
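The effect of the baseline can be seen by dividing the same absolute change by each of the two risks. The sketch below uses Python's `fractions` module so the arithmetic stays exact; `relative_change` is a hypothetical helper name.

```python
from fractions import Fraction


def relative_change(absolute_change, baseline):
    """Absolute change in risk expressed as a fraction of the chosen baseline."""
    return absolute_change / baseline


risk_without = Fraction(6, 10)  # risk of failing without StatsLC
risk_with = Fraction(2, 10)     # risk of failing with StatsLC
absolute_change = risk_without - risk_with

# Same 0.4 difference, two different baselines.
decrease = relative_change(absolute_change, risk_without)
increase = relative_change(absolute_change, risk_with)

print(f"{float(decrease):.1%}")  # 66.7%
print(f"{float(increase):.0%}")  # 200%
```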

A great way to develop interaction and group discussion would be to give individuals in the group different information that is needed for the computation. Later on you could include one wrong “fact”, which they would need to ferret out. Another possibility would be to give students information about different scenarios that they need to present in the best or worst possible light.

These are great teaching opportunities, and worthwhile for everyday life.  It is a good thing they have been included in the NZ curriculum for year 12.

A note to regular readers – I will probably be posting less frequently for a while, but feel free to read back over some of my previous 95 posts if you miss the weekly rant. ;)

Those who can, teach statistics

The phrase I despise more than any in popular use (and believe me there are many contenders) is “Those who can, do, and those who can’t, teach.” I like many of the sayings of George Bernard Shaw, but this one is dismissive, and ignorant and born of jealousy. To me, the ability to teach something is a step higher than being able to do it. The PhD, the highest qualification in academia, is a doctorate. The word “doctor” comes from the Latin word for teacher.

Teaching is a noble profession, on which all other noble professions rest. Teachers are generally motivated by altruism, and often go well beyond the requirements of their job-description to help students. Teachers are derided for their lack of importance, and the easiness of their job. Yet at the same time teachers are expected to undo the ills of society. Everyone “knows” what teachers should do better. Teachers are judged on their output, as if they were the only factor in the mix. Yet how many people really believe their success or failure is due only to the efforts of their teacher?

For some people, teaching comes naturally. But even then, there is the need for pedagogical content knowledge. Teaching is not a generic skill that transfers seamlessly between disciplines. You must be a thinker to be a good teacher. It is not enough to perpetuate the methods you were taught with. Reflection is a necessary part of developing as a teacher. I wrote in an earlier post, “You’re teaching it wrong”, about the process of reflection. Teachers need to know their material, and keep up-to-date with ways of teaching it. They need to be aware of ways that students will have difficulties. Teachers, by sharing ideas and research, can be part of a communal endeavour to increase both content knowledge and pedagogical content knowledge.

There is a difference between being an explainer and being a teacher. Sal Khan, maker of the Khan Academy videos, is a very good explainer. Consequently many students who view the videos are happy that elements of maths and physics that they couldn’t do, have been explained in such a way that they can solve homework problems. This is great. Explaining is an important element in teaching. My own videos aim to explain in such a way that students make sense of difficult concepts, though some videos also illustrate procedure.

Teaching is much more than explaining. Teaching includes awakening a desire to learn and providing the experiences that will help a student to learn.  In these days of ever-expanding knowledge, a content-driven approach to learning and teaching will not serve our citizens well in the long run. Students need to be empowered to seek learning, to criticize, to integrate their knowledge with their life experiences. Learning should be a transformative experience. For this to take place, the teachers need to employ a variety of learner-focussed approaches, as well as explaining.

It cracks me up, the way sugary cereals are advertised as “part of a healthy breakfast”. It isn’t exactly lying, but the healthy breakfast would do pretty well without the sugar-filled cereal. Explanations really are part of a good learning experience, but need to be complemented by discussion, participation, practice and critique.  Explanations are like porridge – healthy, but not a complete breakfast on their own.

Why statistics is so hard to teach

“I’m taking statistics in college next year, and I can’t wait!” said nobody ever!

Not many people actually want to study statistics. Fortunately many people have no choice but to study statistics, as they need it. How much nicer it would be to think that people were studying your subject because they wanted to, rather than because it is necessary for psychology/medicine/biology etc.

In New Zealand, with the changed school curriculum that gives greater focus to statistics, there is a possibility that one day students will be excited to study stats. I am impressed at the way so many teachers have embraced the changed curriculum, despite limited resources, and late changes to assessment specifications. In a few years as teachers become more familiar with and start to specialise in statistics, the change will really take hold, and the rest of the world will watch in awe.

In the meantime, though, let us look at why statistics is difficult to teach.

  1. Students generally take statistics out of necessity.
  2. Statistics is a mixture of quantitative and communication skills.
  3. It is not clear which are right and wrong answers.
  4. Statistical terminology is both vague and specific.
  5. It is difficult to get good resources, using real data in meaningful contexts.
  6. One of the basic procedures, hypothesis testing, is counter-intuitive.
  7. Because the teaching of statistics is comparatively recent, there is little developed pedagogical content knowledge. (Though this is growing)
  8. Technology is forever advancing, requiring regular updating of materials and teaching approaches.

On the other hand, statistics is also a fantastic subject to teach.

  1. Statistics is immediately applicable to life.
  2. It links in with interesting and diverse contexts, including subjects students themselves take.
  3. Studying statistics enables class discussion and debate.
  4. Statistics is necessary and does good.
  5. The study of data and chance can change the way people see the world.
  6. Technological advances have put the power for real statistical analysis into the hands of students.
  7. Because the teaching of statistics is new, individuals can make a difference in the way statistics is viewed and taught.

I love to teach. These days many of my students are scattered over the world, watching my videos (for free) on YouTube. It warms my heart when they thank me for making something clear, that had been confusing. I realise that my efforts are small compared to what their teacher is doing, but it is great to be a part of it.