Statistics – Singular and Plural, Lies and Truth

Language is an issue in teaching and learning statistics. Many words have meanings in statistics that differ from their everyday meanings, and some have multiple meanings even within the study of statistics. Examples of troublesome words are: error, correlation, regression, significant, model. I wrote about addressing this in Teaching Statistical Language.

But the problem starts even with the name of the subject. There are at least three meanings for the term “statistics”, and the word is not even consistently singular or plural. I suggest the three meanings are: data (plural), analysis (singular) and information (plural). What we teach focusses on the analysis, but involves data and information.

Statistics as Data

Sports people love statistics. Game shows and pub quizzes draw on data such as numbers of Olympic medals, wives, years of warfare, Oscars and a myriad other subjects. These statistics can be fascinating, relevant, boring or trivial. My most read blog post is entitled “Khan Academy Statistics videos are not good”. I suspect that quite a few people are searching for statistics about Khan Academy, rather than the subject of my post. This is borne out by the fact that a more recent post:  “Open Letter to Khan Academy about Basic Probability” gets considerably less traffic. I suppose there are not many people who want to know about the probability of Khan Academy. Pity – as the second post is better.

There is an entire discipline around “Official Statistics”. At a recent conference (ORSNZ/NZSA) I was fascinated by a presentation about the need for statistics in a time of disaster and recovery. John Créquer talked about a subject close to my heart, the Christchurch earthquakes. In the weeks and months following the earthquakes, authorities needed information about how many people of high need there were, in order to provide adequate services. Finding these numbers was an exercise in ingenuity and co-operation, drawing on data collected for other purposes. The presenter suggested that at times like that a national register would be invaluable. New Zealand does not have such a thing. It is an interesting conflict between the need for privacy and the public good. Créquer is a statistician from Statistics New Zealand who has been contracted to CERA (the Canterbury Earthquake Recovery Authority) for the time being. I had never before thought that a statistician had uniquely valuable skills and insights to offer in a time of recovery from disaster.

The internet is an amazing source of the data kind of statistics. You can find out the number of an awful lot of things, simply by putting the question in a search box, or looking on Wikipedia. (I’ve made my annual monetary contribution – have you?). Thanks to Wikipedia, we don’t need to wonder about trivial things anywhere near as much as we used to.

Statistics as Analysis

Statistics, as it is taught and learned as a subject, mostly refers to statistical analysis and the inquiry process in which it is embedded. I sometimes wonder what people are thinking when I say that I produce materials to help people learn statistics. Do they imagine a classful of students memorising the populations of countries and batting averages?

“It is easy to lie with statistics. It is hard to tell the truth without it.”

This quote is from Andrejs Dunkels, a person whom I wish I had met. When I was looking for the source of this quote, I found a tribute page to a man who contributed greatly to the world of statistics. His quote uses statistics as a singular noun.

The analysis aspect of statistics involves taking raw data and turning it into information and evidence of what may be truth. Science would not progress far without the tools of statistics to take the raw results of experiments and observations and, using insights gained from the mathematics of probability, discern their significance. Without the discoveries and tools of statistics we would not be able to make sensible inferences about populations from samples and experiments.

Statistical analysis uses mathematical tools, but is far more than just the mathematics. It is easy to produce wrong information by using the mechanistic calculations without thinking critically about the results. I once produced some very wrong models of performance of bank branches, using multiple regression. I even came up with some interesting rationalisations for the counter-intuitive results. Then I did a residual plot and found one outlier that changed everything! Once I removed it, the models changed to the extent that some of the coefficients changed sign. I wonder how many wrong models persist because of well-intentioned, but unskilled analysts.
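A small synthetic sketch (not the original bank-branch data) shows how a single influential point can flip the sign of a regression coefficient, just as the residual plot revealed in my models:

```python
# Sketch: one gross outlier can flip a regression slope's sign.
# All numbers here are invented for illustration.

def slope(xs, ys):
    """Ordinary least-squares slope of y on x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Ten well-behaved points with a clear positive relationship...
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.1, 3.9, 6.2, 8.0, 10.1, 11.8, 14.2, 16.1, 17.9, 20.2]

# ...plus one gross outlier (say, a data-entry error).
xs_out = xs + [30]
ys_out = ys + [-100]

print(slope(xs, ys))          # positive, near 2
print(slope(xs_out, ys_out))  # negative: the sign has flipped
```

A residual plot makes such a point obvious immediately, which is exactly why mechanistic calculation without checking is dangerous.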

There is a wonderful paragraph I used to quote in my second year statistical methods class, that unfortunately I can’t find – even using Wikipedia. It says, in essence: Statistical models are not sausage machines, taking in data and turning it into information without the interference of a human. If the results do not make sense and align with common understanding of the phenomenon, they are probably wrong.

If someone can direct me to the actual quote, I’d be very happy. I used to get the class to recite it in unison.

The point I am making is that the second meaning of statistics is a combination of science and art. It needs people.

Statistics as Information

This is similar to the first meaning, but I think that processed data should have a home separate from raw data. Statistical results include relationships and differences, not just “the facts.” I would put graphs and tables into this category. I think this category is scarier than statistics as data. Everyone can understand that Henry the Eighth had six wives, and New Zealand won six gold medals at the London Olympics. Those are non-scary statistics, and easily accessible. They are statistics as data or facts.

What is more daunting to many people are the results of analysis. This is where we try to explain the population effect of cancer screening, the (statistical) significance of an increase or decrease in birthrate, the effect of seasonality on the sales of jewellery in the USA, or the evidence that increasing numbers of cows are causing a degradation of water quality in natural water sources. These statistics need to be well presented. Part of our role as teachers is to help future producers of such information to express themselves well, so these statistics are accessible. Another part of our role is to help future consumers of statistics to understand them.

Our role is important – for all three types of statistics.

Deterministic and Probabilistic models and thinking

The way we understand and make sense of variation in the world affects decisions we make.

Part of understanding variation is understanding the difference between deterministic and probabilistic (stochastic) models. The NZ curriculum specifies the following learning outcome: “Selects and uses appropriate methods to investigate probability situations including experiments, simulations, and theoretical probability, distinguishing between deterministic and probabilistic models.” This is at level 8 of the curriculum, the highest level of secondary schooling. Deterministic and probabilistic models are not familiar to all teachers of mathematics and statistics, so I’m writing about it today.


The term “model” is itself challenging. There are many ways to use the word, two of which are particularly relevant for this discussion. The first meaning is “mathematical model, as a decision-making tool”. This is the one I am familiar with from years of teaching Operations Research. The second is “a way of thinking about or representing an idea”. Or something like that. It seems to come from psychology.

When teaching mathematical models in entry level operations research/management science we would spend some time clarifying what we mean by a model. I have written about this in the post, “All models are wrong.”

In a simple, concrete incarnation, a model is a representation of another object. A simple example is that of a model car or a Lego model of a house. There are aspects of the model that are the same as the original, such as the shape and ability to move or not. But many aspects of the real-life object are missing in the model. The car does not have an internal combustion engine, and the house has no soft-furnishings. (And very bumpy floors). There is little purpose for either of these models, except entertainment and the joy of creation or ownership. (You might be interested in the following video of the Lego Parisian restaurant, which I am coveting. Funny way to say Parisian!)

Many models perform useful functions. My husband works as a land-surveyor, and his work involves making models, on paper or in the computer, of phenomena on the land, and making sure that specified marks on the model correspond to the marks placed in the ground. The purpose of the model relates to ownership and making sure the sewers run in the right direction. (As a result of several years of earthquakes in Christchurch, his models are less deterministic than they used to be, and unfortunately many of our sewers ended up running the wrong way.)

Our world is full of models:

  • a map is a model of a location, which can help us get from place to place
  • sheet music is a written model of the sound which can make a song
  • a bus timetable is a model of where buses should appear
  • a company’s financial reports are a model of one aspect of the company

Deterministic models

A deterministic model assumes certainty in all aspects. Examples of deterministic models are timetables, pricing structures, a linear programming model, the economic order quantity model, maps, accounting.
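The economic order quantity model mentioned above is a good illustration of determinism: given annual demand D, cost per order S, and annual holding cost per unit H, the formula Q* = √(2DS/H) returns the same answer every time. A sketch with invented numbers:

```python
import math

def eoq(demand, order_cost, holding_cost):
    """Economic order quantity: the deterministic cost-minimising
    order size, Q* = sqrt(2 * D * S / H)."""
    return math.sqrt(2 * demand * order_cost / holding_cost)

# Hypothetical inputs: 1200 units/year demand, $100 per order,
# $6 per unit per year to hold stock.
q = eoq(1200, 100, 6)
print(round(q))  # 200 units per order, every time
```

No matter how often you run it with the same inputs, the answer never varies; that certainty is what makes the model deterministic.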

Probabilistic or stochastic models

Most models really should be stochastic or probabilistic rather than deterministic, but this is often too complicated to implement. Representing uncertainty is fraught. Some more common stochastic models are queueing models, Markov chains, and most simulations.

For example, when planning a school formal, there are some elements of the model that are deterministic and some that are probabilistic. The cost to hire the venue is deterministic, but the number of students who will come is probabilistic. A GPS unit uses a deterministic model to decide on the most suitable route and gives a predicted arrival time. However, we know that the actual arrival time is contingent upon all sorts of aspects including road, driver, traffic and weather conditions.
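The school formal example can be sketched as a small simulation; all the dollar figures and the attendance distribution are invented for illustration:

```python
import random

random.seed(1)

VENUE_HIRE = 2000                  # deterministic: known in advance
TICKET_PRICE = 40                  # deterministic
ATTEND_MEAN, ATTEND_SD = 80, 15    # probabilistic: attendance varies

def simulate_profit():
    """One possible outcome for the formal's finances."""
    attendance = max(0, round(random.gauss(ATTEND_MEAN, ATTEND_SD)))
    return TICKET_PRICE * attendance - VENUE_HIRE

profits = [simulate_profit() for _ in range(10_000)]
mean_profit = sum(profits) / len(profits)
p_loss = sum(p < 0 for p in profits) / len(profits)

print(f"mean profit about ${mean_profit:,.0f}")
print(f"chance of making a loss about {p_loss:.0%}")
```

A deterministic treatment would report only the single figure 40 × 80 − 2000 = $1200; the probabilistic model also tells the organisers how likely they are to lose money.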

Model as a way of thinking about something

The term “model” is also used to describe the way that people make sense out of their world. Some people have a more deterministic world model than others, contributed to by age, culture, religion, life experience and education. People ascribe meaning to anything from star patterns, tea leaves and moon phases to ease in finding a parking spot and not being in a certain place when a coconut falls. This is a way of turning a probabilistic world into a more deterministic and more meaningful world. Some people are happy with a probabilistic world, where things really do have a high degree of randomness. But often we are less happy when the randomness goes against us. (I find it interesting that farmers hit with bad fortune such as a snowfall or drought are happy to ask for government help, yet when there is a bumper crop, I don’t see them offering to give back some of their windfall voluntarily.)

Let us say the All Blacks win a rugby game against Australia. There are several ways we can draw meaning from this. If we are of a deterministic frame of mind, we might say that the All Blacks won because they are the best rugby team in the world.  We have assigned cause and effect to the outcome. Or we could take a more probabilistic view of it, deciding that the probability that they would win was about 70%, and that on the day they were fortunate.  Or, if we were Australian, we might say that the Australian team was far better and it was just a 1 in 100 chance that the All Blacks would win.

I developed the following scenarios for discussion in a classroom. The students can put them in order or categories according to their own criteria. After discussing their results, we could then talk about a deterministic and a probabilistic meaning for each of the scenarios.

  1. The All Blacks won the Rugby World Cup.
  2. Eri did better on a test after getting tuition.
  3. Holly was diagnosed with cancer, had a religious experience and the cancer was gone.
  4. A pet was given a homeopathic remedy and got better.
  5. Bill won $20 million in Lotto.
  6. You got five out of five right in a true/false quiz.

The regular mathematics teacher is now a long way from his or her comfort zone. The numbers have gone, along with the red tick, and there are no correct answers. This is an important aspect of understanding probability – that many things are the result of randomness. But with this idea we are pulling mathematics teachers into unfamiliar territory. Social studies, science and English teachers have had to deal with the murky area of feelings, values and ethics forever.  In terms of preparing students for a random world, I think it is territory worth spending some time in. And it might just help them find mathematics/statistics relevant!

Guest Post: Risk, Insurance and the Actuary

Risk is an inherent part of our daily life. As a result, most of us take out insurance policies as a means of protection against scenarios which, were they to occur, may cause hardship, whether for us or, as in the case of life insurance, for our families.

Insurance companies write many types of policies. The mutual risks of the policy holders are shared so that claims made against the policies can be covered at a much reduced cost. If priced fairly, then the premium reflects the contribution of the insured’s risk to overall risk.
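The risk-sharing idea can be sketched numerically. For independent policies, the pooled total claim is far less volatile relative to its mean than any single policy is; the figures below are invented:

```python
import math

# Sketch: pooling independent risks. For n independent policies each
# with claim mean m and standard deviation s, the pooled total has
# mean n*m and standard deviation sqrt(n)*s, so the relative risk
# (coefficient of variation) shrinks by a factor of sqrt(n).
# The per-policy figures are illustrative.

m, s = 500.0, 2000.0   # one policy: mean claim $500, sd $2000

def pooled_cv(n):
    """Coefficient of variation of the total claims over n policies."""
    return (math.sqrt(n) * s) / (n * m)

print(pooled_cv(1))       # 4.0  -- a single policy is wildly volatile
print(pooled_cv(10_000))  # 0.04 -- per-policy risk largely averaged away
```

This is why claims can be covered at much reduced cost: the pool's total payout is far more predictable than any individual's.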

As policy holders, we want the best price to cover the risk we are offloading; as shareholders of the insurance company (again, often us, through superannuation), we require that the premiums be sufficient to ensure the company stays in business.

It is therefore very important that the analysts pricing the policies (and those calculating the required level of capital to meet the claim liabilities) have the statistical knowledge necessary to measure risk accurately! Understanding risk is even more critical in the framework of Solvency II (*) capital requirements (if it ever gets enforced).

The task is made more difficult as the duration of the policy life varies considerably. Some insurance cover is claimed against shortly after the incident occurs with a short processing time – automobile accidents for instance typically fit this category. This class of cover is termed short-tail liabilities as payments are completed within a short timeframe of the incident occurring.

Other cases arise many years after the original policy was taken out, or payments may occur many years after the original claim was raised – for example medical malpractice. These are termed long-tail liabilities as payments may be made long after the original policy was activated or the incident occurred. Due to the long forecast horizon and [generally] higher volatility in the claim amounts, long-tail liabilities are inherently more risky.

Life insurance is in its own category as everybody dies sometime.

Meet the data

For convenience, and because it is generally less well understood, we restrict our focus to long-tail liability insurance data.

For each claim we have many attributes, but four that are universal to all claims: payment amount(s), incident date (when the originating event resulting in the claim occurred), payment date(s), and state of claim (are further payments possible or is the claim settled). These attributes allow the aggregation of the individual claim data into a series more amenable for analysis at the financial statement level where the volatility of individual claims should be largely eliminated since the risk is pooled.

Actuaries tend to present their data cumulatively in a table like this:

[Actuarial table] The rows are accident years, and the column index (“development time” in actuarial parlance) is the delay between the accident year and the year of payment.

Thus the payments at development lag 0 correspond to all payments made toward claims in the year the accident occurred. The values at development lag 10 correspond to the sum of the payments made in the eleven years since the accident occurred.
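The cumulation described above is a one-line operation; here is a sketch with invented payment figures:

```python
from itertools import accumulate

# Sketch: turning incremental payments into the cumulative run-off
# triangle actuaries tabulate. Each row is an accident year; column j
# is development lag j (years since the accident year). Figures are
# invented; later accident years have fewer observed lags.

incremental = [
    [100, 60, 30, 10],   # accident year 1: payments at lags 0..3
    [120, 70, 35],       # accident year 2: lags 0..2 observed so far
    [110, 65],           # accident year 3
    [130],               # accident year 4: only lag 0 observed
]

cumulative = [list(accumulate(row)) for row in incremental]

for row in cumulative:
    print(row)
# Accident year 1 -> [100, 160, 190, 200]: the lag-3 value holds the
# sum of all payments in the four years since the accident occurred.
```

Note how each cumulative row climbs smoothly toward an ultimate value; the jumpiness of the individual payments has already been hidden.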

This presentation likely arose for a number of reasons, the two most important being:

  • Cumulative data are much easier to work with in the absence of computers;
  • Volatility is visibly less of an issue the further into the development tail when examining cumulatives.

The nature of the inherited data presentation produces some unfortunate consequences:

  • Variability is hard to quantify between parameter uncertainty and process volatility;
  • Calendar year effects (trends down the diagonals) cannot be measured – and therefore cannot be readily predicted;
  • Parameter interpretation is difficult due to the calendar year confounding effects; and
  • Parsimony is hard to achieve.

The actuarial profession attempts to deal with each of these issues in various ways. For instance, the bootstrap is being used to quantify variability. Data may be indexed against inflation to partially account for calendar year trends.
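As a sketch of the bootstrap idea mentioned above (the generic resampling mechanism only; the actuarial version is applied to model residuals), with invented loss figures:

```python
import random
import statistics

random.seed(42)

# Sketch of the bootstrap: resample the observed losses with
# replacement many times, recompute the statistic of interest each
# time, and use the spread of those recomputed values to quantify
# variability. The loss figures are invented.

losses = [120, 95, 210, 130, 400, 150, 90, 175, 260, 110]

boot_means = []
for _ in range(5000):
    resample = [random.choice(losses) for _ in losses]
    boot_means.append(statistics.mean(resample))

print(round(statistics.mean(boot_means), 1))   # close to mean(losses) = 174
print(round(statistics.stdev(boot_means), 1))  # a bootstrap standard error
```

The attraction is that no distributional formula is needed: the data themselves supply the estimate of uncertainty.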

Why spend time on this?

Fundamentally because, if you want to solve a problem, you first have to be sure that the data you are using, and the way you are using it, allow you to solve the problem! The profession has spent much time, energy, and analysis on developing techniques to solve the risk measurement problem, but with the underlying assumption that cumulation is the way to analyse insurance data.

Aside: this is why I enjoy Genetic Programming – not because the algorithm allows the automatic generation of solutions, but rather because you have to formulate the problem very precisely in order to ensure the right problem is solved.

Understanding the problem

The objective of analysing insurance portfolios is to quantify the expected losses incurred by the insurance company and the volatility (the risk) associated with the portfolio, so that adequate money is raised to pay all liabilities, at a reasonable price, with an excellent profit. Additional benefits may arise, such as an improved understanding of the policies being written, targeting of more profitable customers, and so forth, but these are secondary.

Assume the data available are the loss data with the three attributes of accident time, calendar time, and payment. Forget about claim state for now, though this is an important factor for future projections.

We immediately identify two time attributes. This suggests time series models are likely a good starting point for analysis. We would also examine the distribution(s) of incremental losses rather than cumulate the losses over time, since cumulation of time series would hide the volatility of the losses at the individual time points – the very component that we are interested in.
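A tiny numerical sketch, with invented figures, of how cumulation hides the volatility we care about:

```python
import statistics
from itertools import accumulate

# Sketch: the incremental payments below bounce around a lot, but the
# cumulative series climbs smoothly, and its later percentage steps
# look tiny. The volatility has not gone away; it has been masked.
# Figures are invented.

incremental = [100, 80, 40, 60, 20, 30, 10]
cumulative = list(accumulate(incremental))   # [100, 180, 220, ...]

cv_incremental = statistics.stdev(incremental) / statistics.mean(incremental)
pct_steps = [(b - a) / a for a, b in zip(cumulative, cumulative[1:])]

print(f"incremental coefficient of variation: {cv_incremental:.2f}")  # large
print(f"final cumulative step: {pct_steps[-1]:.1%}")                  # looks tiny
```

The relative spread of the incrementals is large, yet the last cumulative step is only a few percent of the running total, which is exactly why cumulative triangles look deceptively stable in the tail.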

Further, we need the ability to distinguish between parameters, parameter uncertainty, and the process volatility. Process volatility and parameter uncertainty drive the critical risk metrics which are essential to ensuring adequate capital is set aside to not only cover the expected losses, but also allow for the unexpected losses should they occur.

Beginning with this foundation, modelling techniques which take the fundamental time-series nature of the data into account are almost certain to provide superior performance to methodologies which mask (for historical reasons mentioned) the time series nature of the data.

Is this new?

Actually, no. All the above considerations of analysis of P&C insurance data were presented many years ago. However, time series approaches are not typically taught to aspiring P&C actuaries. Why?

Perhaps several reasons:

  • Tradition. Like any specialised profession, a system is developed to provide solutions and unless the system is convincingly broken, the uptake of new methodology is resisted.
  • Statistical analysis is complicated. Applying standard formulae to get answers is “easy” when you know the formulae.

The catch

Misrepresenting data leads to a flawed model representing the underlying data processes.

The likelihood of such a methodology resulting in the correct mean or a correct measure of the volatility is extremely low. The distributional assumptions are likely completely spurious as the fundamental nature of the data is not recognised.

Wrong model = wrong conclusion, unless you’re unlucky

Applying the wrong statistical technique to a statistical problem is a common, general failing. This brings to mind the statement: “All models are wrong, but some are useful.” That is not entirely fair in my mind, as it (wrongly) places the blame on the model, where the blame should actually be on the analyst and their choice of modelling method.

Although we will never find the model driving the underlying data generating process, we can often approximate the data process well (otherwise modelling of any kind would be pointless). These are the useful models. Then you are only unlucky if your model looks as though it is useful, but fails when it comes to prediction.

In summary

  • The problem of quantifying risk is not a simple exercise
  • Insurance data is fundamentally financial time series data
  • The right starting point is critical to any statistical analysis
  • We statisticians need to explain our solutions in a way that is meaningful to established professions

(*) In essence, Solvency II comprises insurance legislation aiming to improve policyholder protection by introducing a clear, comprehensive framework for a market-consistent risk model. In particular, insurance companies must be able to withstand a 1-in-200-year loss event in the next calendar year, encompassing all levels of risk sources – insurance and reserve risk, catastrophe risk, operational risk, and default risk, to name a few. Quantitative impact study documents are available here; a general discussion of Solvency II can be found here. The legislation has been postponed many times.

About David Munroe

David Munroe leads Insureware’s outstanding statistical department. Comments in this article are the author’s own and do not necessarily represent the position of Insureware Pty Ltd.

He completed a Master’s degree in Statistics (with First Class Honours) at Massey University, New Zealand.

David has experience in statistical and actuarial analysis, along with C++ programming knowledge. Previous projects include working with a Canadian insurance company on software training and implementation, resulting in significant modelling improvements (regions can be modelled within a working day, allowing analysts to focus on providing extracted insights to management).

David studied the art of Shaolin Kempo for over nine years, holds a second degree black belt, and is qualified in the use of Okinawan weaponry. He is also interested in music (piano), literature, photography, and self sufficiency. He also has two children on the autism spectrum.

Analysis of “Deal or No Deal” results

Deal or No Deal

My son, Jonathan, loves game-shows, and his current favourite is Deal or No Deal, the Australian version. It has been airing now for over ten years, and there is at least one episode available every weeknight on New Zealand television. I often watch it with him as it is a nice time to spend together. We discuss whether people should take the deal or not, and guess what the bank offer will be. There are other followers of the programme, equally devoted, and I am grateful to Paul Corfiatis and his mum who fastidiously collected data for all the 215 programmes in 2009 on the final takings, the case chosen and the case containing the $200,000. In this post I analyse this data, and give some ideas of how this can be used in teaching.

Deal or No Deal, explained

You can find out ALL about Deal or No Deal on Wikipedia. I was excited to see our New Zealand radio gameshow, “The Money or the Bag”, given as an antecedent. There are numerous incarnations of the game. The basic idea is that there are 26 cases, containing a range of money values from 50c to $200,000. The money values are randomly assigned, and their allocation is unknown to the contestant and the “banker”. The contestant chooses one of the cases, and chats to the host, Andrew O’Keefe, about what they will do with the money when they win. The usual responses are to have a big wedding or travel. As the programme is filmed in Melbourne, often second-generation Australians want to visit their parents’ homeland. Usually the contestant has a friend or family member as a podium player, who interacts as part of the banter. In the first round, the player chooses six cases to open, thus gaining information about the possible value in their case. At the end of the round, the banker offers a sum of money to buy back the case from the contestant, who must choose “Deal” (take the money) or “No Deal” (keep the case and its contents). In the second round five cases are opened and then there is another bank offer. This continues until the sixth round, and from then the cases are opened one at a time, with an offer made after each one. The player either takes the deal at some point, or holds out until the end, at which point they take the contents of the case. There are other variants on this basic game, to add variety.

Human aspects

My son is blind and has autism, and finds much to like about this programme. He likes the order of it all – every night, a very similar drama is played out, and he can understand exactly what is happening. He also likes the agony and the joy. He gets very excited when the case containing $200,000 is opened with the special sound effect, and Andrew says, “Oh No”. He likes hearing about the people, and their lives and he likes that you never know how much you might win.

I also like the drama and the joy, but I’d rather not watch when it is going badly. I like it because it is an insight into people’s perceptions of chance. Like many people, I yell at the screen, telling them to take the deal when we see them being reckless, but I am usually happy when their foolish decisions turn out well.  To me it is a true reality show – not because the situation is in any way like reality, but because the people are authentic in their responses. I have been known to weep when a nice person wins a sizeable amount of money. One day I would love to go on the show, as I know how much joy that would bring Jonathan, to be a part of it.

Part of the appeal is the collective experience of it all. The podium players, the audience and the people at home feel connected to the main contestant. One episode that Jonathan loves to tell people about is with Josh Sharpe who was REALLY unlucky. You can see that here on YouTube:

The probability

The probability calculation for Deal or No Deal is very simple. The contestant has one chance in twenty-six that their case contains the big prize. They have four chances in twenty-six that their case contains a prize of $50,000 or more. The expected value of their prize, if they hold onto their case to the end, is about $19,900 (valuing the car at $30,000). When the banker makes an offer, it is often around the expected value of the remaining unopened cases (the average amount left). There are times when the offer is considerably lower or higher than the expected value, which seems to be an effort to push the contestant one way or the other. Contestants very seldom take the deal in the early rounds of the game.
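The expected-value calculation is just the mean of the values still in play. The prize list below is an illustrative subset, not the actual Australian board:

```python
# Sketch: the banker's reference point is the mean of the cases still
# unopened. The values below are a made-up mid-game board, not the
# actual Australian prize list.

remaining = [0.50, 5, 50, 500, 5000, 50_000, 200_000]

expected_value = sum(remaining) / len(remaining)
print(f"${expected_value:,.2f}")  # $36,507.93

# With 26 equally likely cases at the start of the game,
# P(your case holds the top prize) = 1/26.
p_top = 1 / 26
print(f"{p_top:.3f}")  # 0.038
```

Note how one large value dominates the mean: the expected value here is far above the median remaining prize, which is part of what makes the decisions so agonising.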

There are a number of interesting questions we can explore:

  • What is the distribution of the actual outcomes for contestants?
  • How often do contestants do better than what is in their case?
  • Are there any “lucky” cases that contain the big prize more often than others?

To explore these questions I am using the data so diligently collected by Paul Corfiatis. I will use data from games with the regular list of prizes, not “Fantastic Four”, which has some more high value cases.

What is the actual outcome for the contestants?

The following graph shows the amount of money the contestants win, either by taking the deal or hanging out for the case.


You could have an interesting discussion about the factors to account for in looking at this. You would expect the mean to be lower for the “case” prizes, as they tend to be people who have kept going to the bitter end. There is a very large standard deviation.

Here is a table of results:

                      Case      Deal      Either case or deal
Number of instances   53        146       199
Mean                  $6139     $21,044   $17,075
Median                $500      $18,350   $15,000
Standard deviation    $13,740   $14,499   $15,721
Minimum               $0.50     $950      $0.50
Maximum               $50,000   $100,000  $100,000

How often do contestants do better than what is in their case?

For this I calculated the prize less the amount that was in their case. The mean value was $1082, with a median of $9969.50, a minimum of -$170,050 and a maximum of $99,995. Contestants who took the deal did better 106 times out of 146, or 73% of the time.

Lucky Cases

And of course the one to make the statisticians smile – are there any lucky cases?

Here is a graph of the distribution of cases that held the $200,000. I am tempted to make glib comments about how clearly 14 is a lucky case, so you should pick that one, but then, maybe you should pick 19, as it hasn’t had the $200,000 much. But as you never know who is going to quote you, I’d better not.

Which case contained the $200,000 in 2007.
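The lucky-case question can be checked formally with a simulated chi-square test of uniformity. The counts below are invented stand-ins for the real counts in the graph, purely to show the method:

```python
import random

random.seed(0)

# Sketch: are some cases "lucky"? Compare the spread of observed
# winning-case counts to what pure chance produces when the $200,000
# lands uniformly at random in one of 26 cases over 199 games.
# The observed counts below are hypothetical, not the real data.

N_GAMES, N_CASES = 199, 26
expected = N_GAMES / N_CASES

def chi_sq(counts):
    """Chi-square goodness-of-fit statistic against uniformity."""
    return sum((c - expected) ** 2 / expected for c in counts)

observed = [5, 9, 7, 6, 8, 10, 7, 4, 8, 9, 6, 7, 12, 14, 6, 8, 7, 9,
            3, 8, 7, 9, 6, 8, 9, 7]   # invented; sums to 199
obs_stat = chi_sq(observed)

# Build the null distribution by simulation rather than tables:
sims = []
for _ in range(2000):
    counts = [0] * N_CASES
    for _ in range(N_GAMES):
        counts[random.randrange(N_CASES)] += 1
    sims.append(chi_sq(counts))

p_value = sum(s >= obs_stat for s in sims) / len(sims)
print(f"p = {p_value:.2f}")  # a large p: no evidence of lucky cases
```

With a large p-value, an apparently “lucky” case like 14 is exactly the sort of bump chance produces on its own, which is the glib comment the graph tempts us to forget.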

Educational use for this

Depending on how much you wish to torment your students, and the educational objectives, you could give them the raw data, as provided on the site, and see what they come up with.  Or you could simply present the results given in this post, watch an episode, and discuss what meanings people could take from the data, and what misconceptions might occur.

About blogging

This is the 100th post on “Learn and Teach Statistics and Operations Research”. To celebrate, I am writing about the joys of blogging.

Anyone with an internet connection can blog these days, and many do! Reading blogs on pretty much anything you want to know is the procrastinator’s “dark playground”. (For an explanation, with pictures, of the dark playground, where the instant gratification monkey holds sway until the panic monster arrives, see this entertaining post: Why Procrastinators Procrastinate.)

I started to blog to build a reputation for knowing about teaching statistics and operations research. This would lead people to buy our apps, subscribe to our on-line materials and watch my YouTube videos. Many blogs are set up, like this, in order to build credibility and presence on the internet. I’ve found it quite exciting to watch the readership grow, and I particularly love it when people comment. I also like to feel that I am doing some good in the world. The process of writing is also a learning process for me.

Here are some lightly structured thoughts about what I’ve learned over the last 99 posts.

A blog is not a scholarly research paper

As I come from an academic background, I have had to remind myself that a blog is different from a scholarly research paper. A blog isn’t scholarly, it isn’t based on research (unless you can call time in the shower that) and it isn’t on paper.

Blogging rewards bad behaviour.

The more opinionated you are, and the less evidence you use to support your argument, the more readers you get.  You must remove equivocation. Often after I write my first draft, I go through and remove statements like “in my opinion” or  “it seems”.  This is the antithesis of a scholarly paper, which must be carefully stated in balanced and measured tones.

Blogs are personal

It is good to be personal in a blog. In journal articles we avoid the use of first person language as if the paper were somehow written by itself. This can give rise to convoluted sentence structures and endless passive voice. When I write my blog, I talk about my own ideas, and even aspects of my life. I mention side tracks, and give a little bit of myself. And I prefer to read blogs that have a bit of the author in them. I think you need a little touch of narcissism to enjoy blogging.

Quantity is more important than quality

Volume in blogging dominates quality. Some might argue that this is also true for academic papers. In a blog you are better off dashing off one opinion piece a week than putting the same effort into one scholarly paper. If one falls flat, it really doesn’t matter.

Blogs give instant gratification

Blogs have a quick turn-around, ideal for people with short attention spans who want instant gratification. In academia the delay between doing the research and seeing it in print is measured in years. By the time an article has been through the review process, you have almost forgotten why you did the research in the first place. And don’t really care anymore. But when you blog and click “Publish”, it is out there in the world for all to see.

People read blogs

People read your blog. It is an amazing feeling to send my thoughts out into the world and watch the viewing stats on WordPress, knowing that hundreds and sometimes even thousands of people, literally all over the world, are reading my opinion. And sometimes I even get emails from fans, telling me how a post has helped them or inspired them to work or do research in the area of statistics education. Or I find that an educational institution has set a link to one of my posts for their students to read. In contrast, I wonder if anyone has ever read my journal articles, apart from the reviewers. Not only do people read your blog, but you can see where they live, what they read, and even what search terms brought them to the blog. Some search terms boggle the mind: first that someone entered them, and second that they led to my blog! The term “rocks” has led to my site 66 times in the last two years, which I am sure was disappointing for the searchers. The most common search term is “causation”.

Blogging does not get you promoted

Though blogging is fun and great for attention-seekers, it does not improve your PBRF ratings (in NZ) or whatever the measure of publication activity is in a specific country. Nor does blogging count for promotion or tenure. This may be simply a matter of time to allow attitudes to change, as erudite blogs can get scientific findings out into the public domain far more rapidly than the old print-based system.

People can be mean

A blogger needs to have a thick skin. I don’t yet, and have to remind myself that I didn’t research my posts, so it is only fair for people to offer opposing views. In fact, one of the great qualities of a blog is that anyone can respond and improve its quality. I love it when people leave comments; it is the emailed “hate-messages” that are a bit upsetting.

Keynote speaker

One spin off of a successful blog is that you get asked to be a keynote speaker.

Actually I’m kidding on that one. I’d love to be a keynote speaker, and I’m pretty sure I could entertain a crowd and give them something to think about for an hour or so, but it hasn’t happened. Yet. Any invitations?

Proving causation

Aeroplanes cause hot weather

In Christchurch we have a weather phenomenon known as the “Nor-wester”, which is a warm dry wind, preceding a cold southerly change. When the wind is from this direction, aeroplanes make their approach to the airport over the city. Our university is close to the airport in the direct flightpath, so we are very aware of the planes. A new colleague from South Africa drew the amusing conclusion that the unusual heat of the day was caused by all the planes flying overhead.

Statistics experts and educators spend a lot of time refuting claims of causation. “Correlation does not imply causation” has become a catch cry of people trying to avoid the common trap. This is a great advance in understanding that even journalists (notoriously math-phobic) seem to have caught onto. My own video on important statistical concepts ends with the causation issue. (You can jump to it at 3:51)

So we are aware that it is not easy to prove causation.

In order to prove causation we need a randomised experiment. We need to randomise any possible factor that could be associated with, and thus cause or contribute to, the effect.

There is also the related problem of generalisability. If we do have a randomised experiment, we can prove causation. But unless the sample is also a random, representative sample, we cannot infer that the results will transfer to the population in question. This is nicely illustrated in this matrix from The Statistical Sleuth by Fred L. Ramsey and Daniel W. Schafer.

The relationship between the type of sample and study and the conclusions that may be drawn.

The top left-hand quadrant is the one in which we can draw causal inferences for the population.

Causal claims from observational studies

A student posed this question:  Is it possible to prove a causal link based on an observational study alone?

It would be very useful if we could. It is not always possible to use a randomised trial, particularly when people are involved. Before we became more aware of human rights, experiments were performed on unsuspecting human lab rats. A classic example is the Vipeholm experiments where patients at a mental hospital were the unknowing subjects. They were given large quantities of sweets in order to determine whether sugar caused cavities in teeth. This happened into the early 1950s. These days it would not be acceptable to randomly assign people to groups who are made to smoke or drink alcohol or consume large quantities of fat-laden pastries. We have to let people make those lifestyle choices for themselves. And observe. Hence observational studies!

There is a call for “evidence-based practice” in education to follow the philosophy in medicine. But getting educational experiments through ethics committee approval is very challenging, and it is difficult to use rats or fruit-flies to impersonate the higher learning processes of humans. The changing landscape of the human environment makes it even more difficult to perform educational experiments.

To find out the criteria for justifying causal claims in an observational study I turned to one of my favourite statistics textbooks, Chance Encounters by Wild and Seber (page 27). They cite the Surgeon General of the United States. The criteria for the establishment of a cause-and-effect relationship in an epidemiological study are the following:

  1. Strong relationship: for example, illness is four times as likely among people exposed to a possible cause as it is for those who are not exposed.
  2. Strong research design.
  3. Temporal relationship: the cause must precede the effect.
  4. Dose-response relationship: higher exposure leads to a higher proportion of people affected.
  5. Reversible association: removal of the cause reduces the incidence of the effect.
  6. Consistency: multiple studies in different locations produce similar effects.
  7. Biological plausibility: there is a supportable biological mechanism.
  8. Coherence with known facts.

Teaching about causation

In high school, and entry-level statistics courses, the focus is often on statistical literacy. This concept of causation is pivotal to a correct understanding of what statistics can and cannot claim. It is worth spending some time in the classroom discussing what would constitute reasonable proof and what would not. In particular it is worthwhile to come up with alternative explanations for common fallacies, or even truths, in causation. Some examples for discussion might be drink-driving and accidents, smoking and cancer, gender and success in any number of areas, home game advantage in sport, and the use of lucky charms, socks and undies. This also links nicely with probability theory, helping to tie the year’s curriculum together.

Absolute and Relative Risk

It is important that citizens can make sense out of the often outrageous claims of advertisers and pro-screening advocates. It isn’t what they say, but how they say it. What looks like a very large and scary increase in risk can in fact make very little practical difference. Conversely, a large risk can be made to look smaller through the manner in which it is communicated.

I found a wonderful set of notes on the Census at School site, presented as a PowerPoint file.

I also found several very interesting and educational sites about risk.

This first one explains about risk and relative risk: Science blog on Cancer Research UK

This one also includes Number needed to treat. Patient Health UK.

And here is a great summary and set of exercises at the Auckland Maths Association website. You need to scroll down to “Relative Risk Resources”. (I found this after writing the rest of the blog, and it pretty much says what I say, but more succinctly!)

Teaching about Risk

Risk is a great topic for teaching about probability, percentages and perception.

It’s what’s on the bottom that counts!

In exploring risk, there are several distinct processes needed. Depending on the format in which the information is given, students may need to construct their own frequency table, or interpret the one provided. From the frequency table they must calculate the probability, making sure that they choose the correct denominator. Then if they are looking for relative risk, they need to make sure that they again choose the correct denominator. For some reason the numerator is usually easier; it is the denominator that can be tricky.

We can use as an example the increase in probability of passing a particular statistics course if students use our Statistics Learning Centre materials to help them. We haven’t collected any data yet, so these figures are aspirational (as in a work of fiction!). Because we are talking about risk, we have to frame the outcome in negative terms. We would not talk about the risk of passing a course, but rather of failing one. So we will say that students who use StatsLC materials reduce their risk of failing by 66.7%. That is pretty impressive, but how much better it sounds if we frame it in terms of how much their risk will increase if they decide not to use the wonderful materials from StatsLC. Their risk of failure increases by 200%. That sounds pretty drastic.

But what we have failed to mention is the absolute risk: the proportion of students who fail their stats courses with and without the help of StatsLC. Here are some pairs of absolute risks that would all produce the results given:

All of the following sets of numbers show a 200% increase in risk of failure for students who do not use StatsLC materials.


Scenario   Risk of failing when using StatsLC   Risk of failing when not using StatsLC   Actual increase in risk of failing
A          1%                                   3%                                       2 percentage points
B          10%                                  30%                                      20 percentage points
C          20%                                  60%                                      40 percentage points
In Scenario A, the pass-rate for the statistics course has gone from 97% to 99%. In Scenario B, the pass-rate has gone from 70% to 90%, and in Scenario C, the pass-rate has gone from 40% to 80%. All of these scenarios could accurately be described by the same change in relative risk: in each one the risk of failing triples (a 200% increase) if the student does not use StatsLC.
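As a quick check, the three scenarios can be computed directly. Here is a minimal sketch in Python, using the hypothetical fail rates implied by the pass-rates above (these are the post’s made-up figures, not real data):

```python
# Each pair is (risk of failing with StatsLC, risk of failing without).
# These absolute risks are the hypothetical ones from the scenarios above.
scenarios = {
    "A": (0.01, 0.03),  # pass-rate 99% vs 97%
    "B": (0.10, 0.30),  # pass-rate 90% vs 70%
    "C": (0.20, 0.60),  # pass-rate 80% vs 40%
}

for name, (risk_with, risk_without) in scenarios.items():
    absolute_increase = risk_without - risk_with
    # Relative increase uses the users' risk as the baseline.
    relative_increase = absolute_increase / risk_with
    print(f"Scenario {name}: absolute increase {absolute_increase:.0%}, "
          f"relative increase {relative_increase:.0%}")
```

Running this prints a 200% relative increase for every scenario, even though the absolute increases (2, 20 and 40 percentage points) are very different.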

This is really the end of the story, based on what is reported. But if we wish to find out what is really going on, the best idea is to build a table of natural frequencies. These are great for calculating conditional probabilities by stealth.

Here is a table of natural frequencies for Scenario C above, using 1000 as our total number of people. Before we fill it out, we also need to know how many people used Statistics Learning Centre materials. 30% of students did NOT use StatsLC materials.



                     Pass               Fail               Total in category
Use StatsLC          80% of 700 = 560   20% of 700 = 140   700
Do not use StatsLC   40% of 300 = 120   60% of 300 = 180   300
Total pass or fail   680                320                1000
From this table, all manner of statistics can be computed.

What proportion of students who passed, used the StatsLC materials?

The answer is (the number of people who passed AND used StatsLC materials)/(the number of people who passed) = 560/680 ≈ 82%. It is important to find the correct denominator.

Then when people calculate relative risk, it is important to be careful about choosing the baseline.

Another question might be, by how much does your risk of failure decrease, in relative terms, if you use the StatsLC materials?

The first step is to find the decrease in absolute terms. The risk of failure when not using StatsLC is 0.6. The risk of failure when using StatsLC has decreased to 0.2. That is an absolute decrease in risk of 0.4. Then we need to express this relative to the baseline. As we are talking about the decrease in risk, it is compared with the larger number, 0.6, the risk of failing when not using the StatsLC materials. So 0.4/0.6 = 0.667, or a 66.7% decrease. However, if we were talking about the increase in risk for NOT using StatsLC materials, then the baseline is 0.2, and we would find 0.4/0.2 = 2.0, a 200% increase.
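The whole worked example, from building the natural-frequency table through to the conditional probability and both relative-risk framings, can be sketched in a few lines of Python (again using the post’s hypothetical figures for Scenario C):

```python
# Natural-frequency table for Scenario C: 1000 students, 30% do NOT use
# StatsLC materials; fail rates are 20% for users and 60% for non-users.
total = 1000
non_users = round(0.30 * total)               # 300
users = total - non_users                     # 700

users_fail = round(0.20 * users)              # 140
users_pass = users - users_fail               # 560
non_users_fail = round(0.60 * non_users)      # 180
non_users_pass = non_users - non_users_fail   # 120

total_pass = users_pass + non_users_pass      # 680

# P(used StatsLC | passed): the denominator is everyone who passed.
p_used_given_pass = users_pass / total_pass
print(f"P(used StatsLC | passed) = {users_pass}/{total_pass} "
      f"= {p_used_given_pass:.0%}")

# The relative change in risk depends on which baseline you choose.
risk_user = users_fail / users                # 0.2
risk_non_user = non_users_fail / non_users    # 0.6
decrease = (risk_non_user - risk_user) / risk_non_user  # decrease for users
increase = (risk_non_user - risk_user) / risk_user      # increase for non-users
print(f"decrease: {decrease:.1%}, increase: {increase:.0%}")
```

The same absolute difference of 0.4 gives a 66.7% decrease against one baseline and a 200% increase against the other, which is the whole point of the exercise.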

A great way to develop interaction and group discussion would be to give individuals in the group different information that is needed for the computation. Later on you could include one wrong “fact”, which they would need to ferret out. Another possibility would be to give students information about different scenarios that they need to present in the best or worst possible light.

These are great teaching opportunities, and worthwhile for everyday life.  It is a good thing they have been included in the NZ curriculum for year 12.

A note to regular readers – I will probably be posting less frequently for a while, but feel free to read back over some of my previous 95 posts if you miss the weekly rant. ;)

Those who can, teach statistics

The phrase I despise more than any in popular use (and believe me there are many contenders) is “Those who can, do, and those who can’t, teach.” I like many of the sayings of George Bernard Shaw, but this one is dismissive, ignorant, and born of jealousy. To me, the ability to teach something is a step higher than being able to do it. The PhD, the highest qualification in academia, is a doctorate. The word “doctor” comes from the Latin word for teacher.

Teaching is a noble profession, on which all other noble professions rest. Teachers are generally motivated by altruism, and often go well beyond the requirements of their job description to help students. Teachers are derided as unimportant, their job as easy. Yet at the same time teachers are expected to undo the ills of society. Everyone “knows” what teachers should do better. Teachers are judged on their output, as if they were the only factor in the mix. Yet how many people really believe their success or failure is due only to the efforts of their teacher?

For some people, teaching comes naturally. But even then, there is the need for pedagogical content knowledge. Teaching is not a generic skill that transfers seamlessly between disciplines. You must be a thinker to be a good teacher. It is not enough to perpetuate the methods you were taught with. Reflection is a necessary part of developing as a teacher. I wrote in an earlier post, “You’re teaching it wrong”, about the process of reflection. Teachers need to know their material, and keep up-to-date with ways of teaching it. They need to be aware of ways that students will have difficulties. Teachers, by sharing ideas and research, can be part of a communal endeavour to increase both content knowledge and pedagogical content knowledge.

There is a difference between being an explainer and being a teacher. Sal Khan, maker of the Khan Academy videos, is a very good explainer. Consequently many students who view the videos are happy that elements of maths and physics that they couldn’t do, have been explained in such a way that they can solve homework problems. This is great. Explaining is an important element in teaching. My own videos aim to explain in such a way that students make sense of difficult concepts, though some videos also illustrate procedure.

Teaching is much more than explaining. Teaching includes awakening a desire to learn and providing the experiences that will help a student to learn.  In these days of ever-expanding knowledge, a content-driven approach to learning and teaching will not serve our citizens well in the long run. Students need to be empowered to seek learning, to criticize, to integrate their knowledge with their life experiences. Learning should be a transformative experience. For this to take place, the teachers need to employ a variety of learner-focussed approaches, as well as explaining.

It cracks me up, the way sugary cereals are advertised as “part of a healthy breakfast”. It isn’t exactly lying, but the healthy breakfast would do pretty well without the sugar-filled cereal. Explanations really are part of a good learning experience, but need to be complemented by discussion, participation, practice and critique.  Explanations are like porridge – healthy, but not a complete breakfast on their own.

Why statistics is so hard to teach

“I’m taking statistics in college next year, and I can’t wait!” said nobody ever!

Not many people actually want to study statistics. Fortunately many people have no choice but to study statistics, as they need it. How much nicer it would be to think that people were studying your subject because they wanted to, rather than because it is necessary for psychology/medicine/biology etc.

In New Zealand, with the changed school curriculum that gives greater focus to statistics, there is a possibility that one day students will be excited to study stats. I am impressed at the way so many teachers have embraced the changed curriculum, despite limited resources, and late changes to assessment specifications. In a few years as teachers become more familiar with and start to specialise in statistics, the change will really take hold, and the rest of the world will watch in awe.

In the meantime, though, let us look at why statistics is difficult to teach.

  1. Students generally take statistics out of necessity.
  2. Statistics is a mixture of quantitative and communication skills.
  3. It is not clear which are right and wrong answers.
  4. Statistical terminology is both vague and specific.
  5. It is difficult to get good resources, using real data in meaningful contexts.
  6. One of the basic procedures, hypothesis testing, is counter-intuitive.
  7. Because the teaching of statistics is comparatively recent, there is little developed pedagogical content knowledge (though this is growing).
  8. Technology is forever advancing, requiring regular updating of materials and teaching approaches.

On the other hand, statistics is also a fantastic subject to teach.

  1. Statistics is immediately applicable to life.
  2. It links in with interesting and diverse contexts, including subjects students themselves take.
  3. Studying statistics enables class discussion and debate.
  4. Statistics is necessary and does good.
  5. The study of data and chance can change the way people see the world.
  6. Technological advances have put the power for real statistical analysis into the hands of students.
  7. Because the teaching of statistics is new, individuals can make a difference in the way statistics is viewed and taught.

I love to teach. These days many of my students are scattered over the world, watching my videos (for free) on YouTube. It warms my heart when they thank me for making something clear, that had been confusing. I realise that my efforts are small compared to what their teacher is doing, but it is great to be a part of it.

On-line learning and teaching resources

Twenty-first century Junior Woodchuck Guidebook

I grew up reading Donald Duck comics. I love the Junior Woodchucks, and their Junior Woodchuck Guidebook. The Guidebook is a small paperback book, containing information on every conceivable subject, including geography, mythology, history, literature and the Rubaiyat of Omar Khayyam.  In our family, when we want to know something or check some piece of information, we talk about consulting the Junior Woodchuck Guidebook. (Imagine my joy when I discovered that a woodchuck is another name for a groundhog, the star of my favourite movie!) What we are referring to is the internet, the source of all possible information! Thanks to search engines, there is very little we cannot find out on the internet. And very big thanks to Wikipedia, to which I make an annual financial contribution, as should all who use it and can afford to.

You can learn just about anything on the internet. Problem is, how do you know what is good? And how do you help students find good stuff? And how do you use the internet wisely? And how can it help us as learners and teachers of statistics and operations research? These questions will take more than my usual 1000 words, so I will break it up a bit. This post is about the ways the internet can help in teaching and learning. In a later post I will talk about evaluating resources, and in particular multimedia resources.


Both the disciplines in which I am interested, statistics and operations research, apply mathematical and analytic methods to real-world problems. In statistics we are generally trying to find things out, and in operations research we are trying to make them better. Either way, the context is important. The internet enables students to find background knowledge regarding the context of the data or problem they are dealing with. It also enables instructors to write assessments and exercises that have a degree of veracity to them even if the actual raw data proves elusive. How I wish people would publish standard deviations as well as means when reporting results!


Which brings us to the second use for on-line resources. Real problems with real data are much more meaningful for students, and totally possible now that we don’t need to calculate anything by hand. Sadly, it is more difficult than first appears to find good quality raw data to analyse, but there is some available. You can see some sources in a previous post and the helpful comments.


If you are struggling to understand a concept, or to know how to teach or explain it, do a web search. I have found some great explanations, and diagrams especially, that have helped me. Or I have discovered a dearth of good diagrams, which has prompted me to make my own.


Videos can help with background knowledge, with explanations, and with inspiring students with the worth of the discipline. The problem with videos is that it takes a long time to find good ones and weed out the others. One suggestion is to enlist the help of your students. They can each watch two or three videos and decide which are the most helpful. The teacher then watches the most popular ones to check for pedagogical value. It is great when you find a site that you can trust, but even then you can’t guarantee the approach will be compatible with your own.

Social support

I particularly love Twitter, from which I get connection with other teachers and learners, and ideas and links to blogs. I belong to a Facebook group for teachers of statistics in New Zealand, and another Facebook group called “I love Operations Research”. These wax and wane in activity, and can be very helpful at times. Students and teachers can gain a lot from social networking.


There is good open-source software available, and 30-day trial versions of other software. Many schools in New Zealand use the R-based iNZight collection of programmes, which provides purpose-built tools for time series analysis, bootstrapping and line fitting.

Answers to questions

The other day I lost the volume control off my toolbar. (Windows Vista, I’m embarrassed to admit). So I put in the search box “Lost my volume control” and was directed to a YouTube video that took me step-by-step through the convoluted process of reinstating my volume control! I was so grateful I made a donation. Just about any computer related question can be answered through a search.

Interactive demonstrations

I love these. There are two sites I have found great:

The National Library of Virtual Manipulatives, based in Utah.

NRich – It has some great ideas in the senior statistics area. From the UK.

A problem with some of these is the use of Flash, which does not play on all devices. And again – how do we decide if they are any good or not?

On-line textbooks

Why would you buy a textbook when you can get one on-line? I routinely directed my second-year statistical methods for business students to “Concepts and Applications of Inferential Statistics”, which I found just the right level. Another source is Stattrek. I particularly like their short explanations of the different probability distributions.

Practice quizzes

There aren’t too many practice quizzes around for free. Obviously, as a provider of statistical learning materials, I believe quizzes and exercises have merit for practice with immediate and focussed feedback. However, it can be very time-consuming to evaluate practice quizzes, and some just aren’t very good. On the other hand, some may argue that any time students spend learning is better than none.

Live help

There are some places that provide live, or slightly delayed, help for students. I got hooked into a very fun site where you earned points by helping students. Sadly I can’t find it now, but as I was looking I found vast numbers of on-line help sites, often associated with public libraries. And there are commercial sites that provide some free help as an intro to their services. In New Zealand there is the StudyIt service, which helps students preparing for assessments in the senior high school years. At StatsLC we provide on-line help as part of our resources, and will be looking to develop this further. From time to time I get questions as a result of my YouTube videos, and enjoy answering them, unless I am obviously doing someone’s homework! I also discovered “ShowMe”, which looks like a great little iPad app that I can use to help people more.

This has just been a quick guide to how useful the internet can be in teaching and learning. Next week I will address issues of quality and equity.

How to learn statistics (Part 2)

Some more help (preaching?) for students of statistics

Last week I outlined the first five principles to help people to learn and study statistics.

They focussed on how you need to practise in order to be good at statistics, and how you should not wait until you understand it completely before you start applying it. I sometimes call this suspending disbelief. Next I talked about the importance of context in a statistical investigation, which is one of the ways that statistics is different from pure mathematics. And finally I stressed the importance of technology as a tool, not only for doing the analysis, but for exploring ideas and gaining understanding.

Here are the next five principles (plus 2):

6. Terminology is important and at times inconsistent

There are several issues with regard to statistical terminology, and I have written a post with ideas for teachers on how to teach terminology.

One issue with terminology is that some words that are used in the study of statistics have meanings in everyday life that are not the same. A clear example of this is the word, “significant”. In regular usage this can mean important or relevant, yet in statistics, it means that there is evidence that an effect that shows up in the sample also exists in the population.

Another issue is that statistics is a relatively young science and there are inconsistencies in terminology. We just have to live with that. Depending on the discipline in which the statistical analysis is applied or studied, different terms can mean the same thing, or very close to it.

A third language problem is that mixed in with the ambiguity of results, and judgment calls, there are some things that are definitely wrong. Teachers and examiners can be extremely picky. In this case I would suggest memorising the correct or accepted terminology for confidence intervals and hypothesis tests. For example I am very fussy about the explanation for the R-squared value in regression. Too often I hear that it says how much of the dependent variable is explained by the independent variable. There needs to be the word “variation” inserted in there to make it acceptable. I encourage my students to memorise a format for writing up such things. This does not substitute for understanding, but the language required is precise, so having a specific way to write it is fine.

This problem with terminology can be quite frustrating, but I think it helps to have it out in the open. Think of it as learning a new language, which is often the case with a new subject. Use glossaries to make sure you really do know what a term means.

7. Discussion is important

This is linked with the issue of language and vocabulary. One way to really learn something is to talk about it with someone else and even to try and teach it to someone else. Most teachers realise that the reason they know something pretty well, is because they have had to teach it. If your class does not include group work, set up your own study group. Talk about the principles as well as the analysis and context, and try to use the language of statistics. Working on assignments together is usually fine, so long as you write them up individually, or according to the assessment requirements.

8. Written communication skills are important

Mathematics has often been a subject of choice for students who are not fluent in English. They can perform well because there is little writing involved in a traditional mathematics course. Statistics is a different matter, though, as all students should be writing reports. This can be difficult at the start, but as students learn to follow a structure, it can be made more palatable. A statistics report is not a work of creative writing, and it is okay to use the same sentence structure more than once. Neither is a statistics report a narrative of what you did to get to the results. Generous use of headings makes a statistical report easier to read and to write. A long report is not better than a short report, if all the relevant details are there.

9. Statistics has an ethical and moral aspect

This principle is interesting, as many teachers of statistics come from a mathematical background, and so have not had exposure to the ethical aspects of research themselves. That is no excuse for students to park their ethics at the door of the classroom. I will be pushing for more consideration of ethical aspects of research as part of the curriculum in New Zealand. Students should not be doing experiments on human subjects that involve delicate topics such as abuse or bullying. Experiments should not involve alcohol or other harmful substances. Students should be aware of the potential to do harm, and make sure that any participants have been given full information and have given consent. This can be quite a hurdle, but is part of being an ethical human being. It also helps students to be more aware when giving or withholding consent in medical and other studies.

10. The study of statistics can change the way you view the world

Sometimes when we learn something at school, it stays at school and has no impact on our everyday lives. This should not be the case with the study of statistics. As we learn about uncertainty and variation we start to see this in the world around us. When we learn about sampling and non-sampling errors, we become more critical of opinion polls and other research reported in the media. As we discover the power of statistical analysis and experimentation, we start to see the importance of evidence-based practice in medicine, social interventions and the like.

11. Statistics is an inherently interesting and relevant subject.

And it can be so much fun. There is a real excitement in exploring data, and becoming a detective. If you aren’t having fun, you aren’t doing it right!

12. Resources from Statistics Learning Centre will help you learn.

Of course!