About Dr Nic

I love to teach just about anything. My specialties are statistics and operations research. I have insider knowledge on Autism through my family. I have a lovely husband, two grown-up sons, a fabulous daughter-in-law and a new adorable grandson. I have four blogs - Learn and Teach Statistics, Never Ordinary Life, Chch Relief Society and StatsLC News.

Teaching Confidence Intervals

If you wanted your students to understand just two things about confidence intervals, what would they be?

What and what order

When making up a teaching plan for anything it is important to think about whom you are teaching, what it is you want them to learn, and what order will best achieve the most important desired outcomes. In my previous life as a university professor I mostly taught confidence intervals to business students, including MBAs. Currently I produce materials to help teach high school students. When teaching business students, I was aware that many of them had poor mathematics skills, and I did not wish that to get in the way of their understanding. High school students may well be more at home with formulas and calculations, but their understanding of the outside world is limited. Consequently the approaches for these two different groups of students may differ.

Begin with the end in mind

I use the “all of the people, some of the time” principle when deciding on the approach to use in teaching a topic. Some of the students will understand most of the material, but most of the students will only really understand some of the material, at least the first time around. Statistics takes several attempts before you approach fluency. Generally the material students learn will be the material they get taught first, before they start to get lost. Therefore it is good to start with the important material. I wrote a post about this, suggesting starting at the very beginning is not always the best way to go. This is counter-intuitive to mathematics teachers who are often very logical and wish to take the students through from the beginning to the end.

At the start I asked this question – if you wanted your students to understand just two things about confidence intervals, what would they be?

To me the most important things to learn about confidence intervals are what they are and why they are needed. Learning about the formula is a long way down the list, especially in these days of computers.

The traditional approach to teaching confidence intervals

A traditional approach to teaching confidence intervals is to start with the concept of a sampling distribution, followed by calculating the confidence interval of a mean using the Z distribution. Then the t distribution is introduced. Many of the questions involve calculation by formula. Very little time is spent on what a confidence interval is and why we need them. This is the order used in many textbooks. The Khan Academy video that I reviewed in a previous post does just this.

A different approach to teaching confidence intervals

My approach is as follows:
Start with the idea of a sample and a population, and that we are using a sample to try to find out an unknown value from the population. Show our video about understanding a confidence interval. One comment on this video decried the lack of formulas. I’m not sure what formulas would satisfy the viewer, but as I was explaining what a confidence interval is, not how to get it, I had decided that formulas would not help.

The new New Zealand school curriculum follows a process to get to the use of formal confidence intervals. Previously the assessment was such that a student could pass the confidence interval section by putting values into formulas in a calculator. In the new approach, early high school students are given real data to play with, and are encouraged to suggest conclusions they might be able to draw about the population, based on the sample. Then in Year 12 they start to draw informal confidence intervals, based on the sample. This uses a simple formula for the confidence interval of a median and is shown in the following video:
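
The video gives the details, but for teachers who want to check the arithmetic themselves, here is a rough sketch in Python. It assumes the informal interval is median ± 1.5 × IQR / √n, which is my recollection of the version used in the New Zealand curriculum, so treat the formula (and the made-up data) as an illustration rather than gospel.

```python
import numpy as np

def informal_median_interval(sample):
    """Informal confidence interval for a median: median +/- 1.5 * IQR / sqrt(n).

    The 1.5 * IQR / sqrt(n) margin is my recollection of the informal interval
    used in the NZ curriculum; it is here purely for illustration.
    """
    sample = np.asarray(sample)
    median = np.median(sample)
    iqr = np.percentile(sample, 75) - np.percentile(sample, 25)
    margin = 1.5 * iqr / np.sqrt(len(sample))
    return median - margin, median + margin

# Made-up data: heights (cm) of a sample of Year 12 students
heights = [158, 162, 170, 175, 165, 160, 181, 169, 173, 167, 176, 171]
print(informal_median_interval(heights))
```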

Then in Year 13, we introduce bootstrapping as an intuitively appealing way to calculate confidence intervals. Students use existing data to draw a conclusion about two medians. This video goes through how this works and how to use iNZight to perform the calculations.
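
iNZight does all of this at the push of a button, but the mechanics are worth seeing once. Here is a rough sketch of a percentile bootstrap for the difference of two medians; the data, the group names and the choice of 1000 resamples are all mine, for illustration only.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def bootstrap_median_difference(a, b, reps=1000, level=0.95):
    """Percentile bootstrap interval for median(a) - median(b)."""
    a, b = np.asarray(a), np.asarray(b)
    diffs = np.empty(reps)
    for i in range(reps):
        # Resample each group with replacement, keeping the original group sizes
        diffs[i] = (np.median(rng.choice(a, size=len(a), replace=True))
                    - np.median(rng.choice(b, size=len(b), replace=True)))
    tail = (1 - level) / 2 * 100
    return np.percentile(diffs, tail), np.percentile(diffs, 100 - tail)

# Made-up weekly spending data for two groups of students
group_a = [12, 15, 9, 22, 14, 11, 18, 13, 16, 10]
group_b = [8, 14, 7, 11, 9, 13, 10, 6, 12, 9]
print(bootstrap_median_difference(group_a, group_b))
```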

In a more traditional course, you could instead use the normal-based formula for the confidence interval of a mean. We now have a video for that as well.
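
For completeness, the normal-based interval amounts to mean ± z × s/√n. A quick sketch of my own, using 1.96 for an approximate 95% interval; a t multiplier would be more defensible for small samples.

```python
import math
from statistics import mean, stdev

def normal_interval_for_mean(sample, z=1.96):
    """Approximate 95% confidence interval for a mean: mean +/- z * s / sqrt(n)."""
    margin = z * stdev(sample) / math.sqrt(len(sample))
    return mean(sample) - margin, mean(sample) + margin

# The same made-up heights as in the sketch above
heights = [158, 162, 170, 175, 165, 160, 181, 169, 173, 167, 176, 171]
print(normal_interval_for_mean(heights))
```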

You could then examine the idea of the sampling distribution and the central limit theorem.

The point is that you start with getting an idea of what a confidence interval is, and then you find out how to find one, and then you start to find out the theory underpinning it. You can think of it as successive refinement. Sometimes when we see photos downloading onto a device, they start off blurry, and then gradually become clearer as we gain more information. This is a way to learn a complex idea, such as confidence intervals. We start with the big picture, and not much detail, and then gradually fill out the details of the how and how come of the calculations.

When do we teach the formulas?

Some teachers believe that the students need to know the formulas in order to understand what is going on. This is probably true for some students, but not all. There are many kinds of understanding, and I prefer conceptual and graphical approaches. If formulas are introduced at the end of the topic, then the students who like formulas are satisfied, and the others are not alienated. Sometimes it is best to leave the vegetables until last! (This is not a comment on the students!)

For more ideas about teaching confidence intervals see other posts:
Good, bad and wrong videos about confidence intervals
Confidence Intervals: informal, traditional, bootstrap
Why teach resampling

The silent dog – null results matter too!

Recently I was discussing the process we use in a statistical enquiry. The ideal is that we start with a problem and follow the statistical enquiry cycle through the steps Problem, Plan, Data collection, Analysis and Conclusion, which then may lead to other enquiries. We have recently published a video outlining this process.

I have also previously written a post suggesting that the cyclical nature of the process was overstated.

The context of our discussion was another video I am working on, that acknowledges that often we start, not at the beginning, but in the middle, with a set of data. This may be because in an educational setting it is too expensive and time consuming to require students to collect their own data. Or it may be that as statistical consultants we are brought into an investigation once the data has been collected, and are needed to make some sense out of it. Whatever the reason, it is common to start with the data, and then loop backwards to the Problem and Plan phases, before performing the analysis and writing the conclusions.

Looking for relationships

We, a group of statistical educators, were suggesting what we would do with a data set, which included looking at the level of measurement, the origins of the data, and the possible intentions of the people who collected it. One teacher suggests to her students that they do exploratory scatter plots of all the possible pairings, as well as comparative dotplots and boxplots. The students can then choose a problem that is likely to show a relationship – because they have already seen that there is a relationship in the data.

I have a bit of a problem with this. It is fine to get an overview of the relationships in the data – that is one of the beauties of statistical packages. And I can see that for an assignment, it is more rewarding for students to have a result they can discuss. If they get a null result there is a tendency to think that they have failed. Yet the lack of evidence of a relationship may be more important than evidence of one. The problem is that we value positive results over null results. This is a known problem in academic journals, and many words have been written about the over-occurrence of type 1 errors and about publication bias. Let me illustrate. A drug manufacturer hopes that drug X is effective in treating depression. In reality drug X is no more effective than a placebo. The manufacturer keeps funding different tests by different scientists. If all the experiments use a significance level of 0.05, then about 5% of the experiments will produce a type 1 error and say that there is an effect attributable to drug X. The (false) positive results can be published, because academic journals prefer positive results to null results. Conversely the much larger number of researchers who correctly concluded that there is no relationship do not get published, and the abundance of evidence to the contrary is invisible. To be fair, it is hoped that these researchers will be able to refute the false positive paper.
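
If students are inclined to take the 5% figure on trust, a small simulation makes it tangible. This sketch is my own illustration of the scenario above: drug X really does nothing, yet roughly one experiment in twenty declares an effect. The sample size and the two-sample t-test are arbitrary choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=2)

experiments, n, alpha = 1000, 50, 0.05
false_positives = 0
for _ in range(experiments):
    # Drug X is really a placebo: both groups come from the same distribution
    drug = rng.normal(loc=0.0, scale=1.0, size=n)
    placebo = rng.normal(loc=0.0, scale=1.0, size=n)
    _, p_value = stats.ttest_ind(drug, placebo)
    if p_value < alpha:
        false_positives += 1

# In the long run roughly 5% of these "no effect" experiments look significant
print(false_positives / experiments)
```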

Let them see null results

So where does this leave us as teachers of statistics? Awareness is a good start. We need to show null effects and why they are important. For every example we give that ends up rejecting the null hypothesis, we need to have an example that does not. Textbooks tend to over-include results that reject the null, so that when a student meets a non-significant result he or she is left wondering whether they have made a mistake. In my preparation of learning materials, I endeavour to keep a good spread of results – strongly positive, weakly positive, inconclusive, weakly negative and strongly negative. This way students are accepting of a null result, and know what to say when they get one.

Another example is in the teaching of time series analysis. We love to show series with strong seasonality. It tells a story. (See my post about time series analysis as storytelling.) Retail sales nearly all peak in December, and various goods have other peaks. Jewellery retail sales in the US have small peaks in February and May, and it is fun working out why. Seasonal patterns seem like magic. However, we need also to allow students to analyse data that does not have a strong seasonal pattern, so that they can learn that such series also exist!

My final research project before leaving the world of academia involved an experiment on the students in my class of over 200. It was difficult to get the study through the human ethics committee, but it made it in the end. The students were divided into two groups, and half were followed up by tutors weekly if they were not keeping up with assignments and testing. The other half were left to their own devices, as had previously been the case. The interesting result was that it made no difference to the pass rate of the students. In fact the proportion of passes was almost identical. This was a null result. I had supposed that following up and helping students to keep up would increase their chances of passing the course. But it didn't. This important result saved us money in terms of tutor input in following years. Though it felt good to be helping our students more, it didn't actually help them pass, so was not justifiable in straitened financial times.

I wonder if it would have made it into a journal.

By the way, my reference to the silent dog in the title is to the famous Sherlock Holmes story, Silver Blaze, where the fact that the dog did not bark was important as it showed that the person was known to it.

Teach students to learn to fish

There is a common saying that goes roughly, “Give a person a fish and you feed him for a day. Teach a person to fish and you feed her for a lifetime.”

Statistics education is all about teaching people to fish. In a topic on questionnaire design, we choose as our application the consumption of sugar drinks, the latest health evil. We get the students to design questionnaires to find out drinking habits. Clearly we don’t want to focus too much on the sugar drink aspect, as this is the context rather than the point of the learning. What we do want to focus on is the process, so that in future, students can transfer their experience writing a questionnaire about sugar drinks to designing a questionnaire about another topic, such as chocolate, or shoe-buying habits.

Questionnaire design is part of the New Zealand school curriculum, and the process includes a desk-check and a pilot survey. When the students are assessed, they must show the process they have gone through in order to produce the final questionnaire. The process is at least as important as the resulting questionnaire itself.

Here is our latest video, teaching the process of questionnaire design.

Examples help learning

Another important learning tool is the use of examples. When I am writing computer code, I usually search on the web or in the manual for a similar piece of code, and work out how it works and adapt it. When I am trying to make a graphic of something, I look around at other graphics, and see what works for me and what does not. I use what I have learned in developing my own graphics. Similarly when we are teaching questionnaire design, we should have examples of good questionnaires, and not so good questionnaires, so that students can see what they are aiming for. This is especially true for statistical report-writing, where a good example can be very helpful for students to see what is required.

Learning how to learn

But I’d like to take it a step further. Perhaps as well as teaching how to design a questionnaire, or write a report, we should be teaching how to learn how to design a questionnaire. This is a transferable skill to many areas of statistics and probability as well as operations research, mathematics, life… This is teaching people to be “life-long learners”, a popular catchphrase.

We could start the topic by asking, “How would you learn how to design a questionnaire?” then see what the students come up with. If I were trying to learn how to design a questionnaire, I would look at what the process might entail. I would think about the whole statistical process, thinking about similarities and differences. I would think about things that could go wrong in a questionnaire. I would also spend some time on the web, and particularly YouTube, looking at lessons on how to design a questionnaire. I would ask questions. I would look at good questionnaires. I would then try out my process, perhaps on a smaller problem. I would evaluate my process by looking at the end-result. I would think about what worked and what didn’t, and what I would do next time.

This gives us three layers of learning. Our students are learning how to write a questionnaire about sugar drinks, and the output from that is a questionnaire. They are also learning the general process of designing a questionnaire, which can be transferred to other questionnaire contexts. Then at the next level up, they are learning how to learn a process, in this case the process of designing a questionnaire. This skill can be transferred to learning other skills or processes, such as writing a time series report, setting up an experiment or critiquing a statistical report.

Levels of learning in the statistics classroom

I suspect that the top layer of learning how to learn is often neglected, but is a necessary skill for success at higher learning. We are keen as teachers to make sure that students have all the materials and experiences they need in order to learn processes and concepts. Maybe we need to think a bit more about giving students more opportunities to be consciously learning how to learn new processes and concepts.

We can liken it a little to learning history. When a class studies a certain period in history, there are important concepts and processes that they are also learning, as well as the specifics of that topic. In reality the topic is pretty much arbitrary, as it is the tool by which the students learn history skills, such as critical thinking, comparing, drawing parallels and summarising. In statistics the context, though hopefully interesting, is seldom important in itself. What matters is the concepts, skills and attitudes the student develops through the analysis. The higher level in history might be to learn how to learn about a new philosophical approach, whereas the higher level in statistics is learning how to learn a process.

The materials we provide at Statistics Learning Centre are mainly fishing lessons, with some examples of good and bad fish.  It would be great if we could also use them to develop students’ ability to learn new things, as well as to do statistics. Something to work towards!

Why I am going to ICOTS9 in Flagstaff, Arizona

I was a university academic for twenty years. One of the great perks of academia is the international conference. Thanks to the tax-payers of New Zealand I have visited Vancouver, Edinburgh, Melbourne (twice), San Diego, Fort Lauderdale, Salt Lake City and my favourite, Ljubljana. This is a very modest list compared with many of my colleagues, as I didn’t get full funding until the later years of my employ.

Academic conferences enable university researchers and teachers from all over the world to gather together and exchange ideas and contacts. They range from fun and interesting to mind-bogglingly boring. My first conference was IFORS in Vancouver in 1996, and I had a blast. It helped that my mentor, Hans Daellenbach, was also there, and I got to meet some of the big names in operations research. I have since attended two other IFORS conferences, and it is amazing how connected you can feel to people whom you meet only every few years. I always try to go to most sessions of the conference as I feel an obligation to the people who have paid to have me there. It is unethical to be paid to go to a conference, and then turn up only for a couple of sessions and the banquet. Sometimes sessions that I have only limited connection with can turn out to be interesting. I found I could always listen for the real world application that I could then include in my teaching. That would usually take up the first few minutes of the talk. Once the formulas appeared I would glaze over and go to my happy place. Having said that, I also think mental health breaks are important, and would take time out to reflect. I get more out of conferences if I leave my husband at home. The quiet time in my hotel room was also important for invigorating my teaching and research.

Most academic conferences focus on research, though they often have a teaching stream, which I frequent. ICOTS is different though as it is mostly about teaching, with a research stream! ICOTS stands for International Conference on Teaching Statistics, and runs every four years. I attended my first ICOTS in Slovenia in 2010. What surprised me was how many people there were from New Zealand! At the welcome reception I wandered around introducing myself to people and more often than not found they were also from New Zealand. How ironic to spend 40 hours getting to this amazing place and meet large numbers of fellow kiwis! (Named for the bird, not the fruit!). Ljubljana is a wonderful city, with fantastic architecture and lots of bike routes and geocaches. I made good use of my spare time. The conference itself was inspiring too. I attended just about every session, and gave a paper about making videos to teach statistics. I saw the dance of the p-value, and learned about statistics teaching in some African countries. I was impressed by the keynote by Gerd Gigerenzer, and went home and cancelled my mammogram. I put faces to some of the names in statistics education, though I was sad not to see George Cobb there, or Joan Garfield. What struck me was how nice everyone was. I loved my trip to some caves on the half-day excursion.

The point of this post is to encourage readers to go to ICOTS 9 in July this year. I admit I was a little disappointed when they announced the venue. I was hoping for somewhere a little more exotic. However the great benefit is that it is going to cost considerably less to get there than to many countries, and take less time. (For people from New Zealand and Australia, a trip of less than 24 hours is a bonus.) Now that I am no longer paid by a university to go to conferences, the cost is a big consideration. If necessary I will sell our caravan. Another benefit of the venue is it is very convenient for teachers from the US to attend. I am hoping to find out more about AP statistics, and other US statistics teaching.

I am currently reviewing an edited book published by Springer, Probabilistic Thinking. As I read each chapter I am increasingly excited that most of the authors will be attending ICOTS9. This is a great opportunity to discuss with them their ideas, and how to apply them in the classroom and in our resources. I am particularly interested in the latest research on how children and adults learn statistics and probability. This ICOTS I am doing a presentation about setting up a blog, Twitter and YouTube. In four years’ time I hope to be able to add to the research using what we have learned from students’ responses on our on-line resources.

I am a little apprehensive about the altitude and temperature, but have planned to arrive a few days early in Phoenix to acclimatise myself. In the interests of economy I will be staying at the university dorms, and just found out there is no air-conditioning in the bedrooms. My daughter-in-law from Utah tells me to buy a fan. I’m pretty happy about a trip to the Grand Canyon on the afternoon off.  The names of presenters and their abstracts are now available on the ICOTS9 website, so you can see what interesting times await.

I really hope I see a lot of you there – and not just New Zealanders.


The Myth of Random Sampling

I feel a slight quiver of trepidation as I begin this post – a little like the boy who pointed out that the emperor has no clothes.

Random sampling is a myth. Practical researchers know this and deal with it. Theoretical statisticians live in a theoretical world where random sampling is possible and ubiquitous – which is just as well really. But teachers of statistics live in a strange half-real-half-theoretical world, where no one likes to point out that real-life samples are seldom random.

The problem in general

In order for most inferential statistical conclusions to be valid, the sample we are using must obey certain rules. In particular, each member of the population must have an equal probability of being chosen. In this way we reduce the opportunity for systematic error, or bias. When a truly random sample is taken, it is almost miraculous how well we can make conclusions about the source population, with even a modest sample of a thousand. On a side note, if the general population understood this, and the opportunity for bias and corruption were eliminated, general elections and referenda could be done at much less cost, through taking a good random sample.
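
To show just how well a sample of a thousand does, a quick simulation helps. The 52% support figure below is entirely made up; the point is that nearly all random samples of 1000 land within about three percentage points of the truth.

```python
import numpy as np

rng = np.random.default_rng(seed=3)

true_support = 0.52          # invented population proportion
sample_size = 1000

# Take 10,000 random samples and record the estimated proportion from each
estimates = rng.binomial(sample_size, true_support, size=10_000) / sample_size

# The middle 95% of estimates sits within roughly +/- 3 percentage points
print(np.percentile(estimates, [2.5, 97.5]))
```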

However! It is actually quite difficult to take a random sample of people. Random sampling is doable in biology, I suspect, where seeds or plots of land can be chosen at random. It is also fairly possible in manufacturing processes. Medical research relies on the use of a random sample, though it is seldom of the total population. Really it is more about randomisation, which can be used to support causal claims.

But the area of most interest to most people is people. We actually want to know about how people function, what they think, their economic activity, sport and many other areas. People find people interesting. To get a really good sample of people takes a lot of time and money, and is outside the reach of many researchers. In my own PhD research I approximated a random sample by taking a stratified, cluster semi-random almost convenience sample. I chose representative schools of different types throughout three diverse regions in New Zealand. At each school I asked all the students in a class at each of three year levels. The classes were meant to be randomly selected, but in fact were sometimes just the class that happened to have a teacher away, as my questionnaire was seen as a good way to keep them quiet. Was my data of any worth? I believe so, of course. Was it random? Nope.

Problems people have in getting a good sample include cost, time and also response rate. Much of the data that is cited in papers is far from random.

The problem in teaching

The wonderful thing about teaching statistics is that we can actually collect real data and do analysis on it, and get a feel for the detective nature of the discipline. The problem with sampling is that we seldom have access to truly random data. By random I do not mean just simple random sampling, which is the least simple method! Even cluster, systematic and stratified sampling can be a challenge in a classroom setting. And sometimes if we think too hard we realise that what we have is actually a population, and not a sample at all.

It is a great experience for students to collect their own data. They can write a questionnaire and find out all sorts of interesting things, through their own trial and error. But mostly students do not have access to enough subjects to take a random sample. Even if we go to secondary sources, the data is seldom random, and the students do not get the opportunity to take the sample. It would be a pity not to use some interesting data, just because the collection method was dubious (or even realistic). At the same time we do not want students to think that seriously dodgy data has the same value as a carefully collected random sample.

Possible solutions

These are more suggestions than solutions, but the essence is to do the best you can and make sure the students learn to be critical of their own methods.

Teach the best way, pretend and look for potential problems.

Teach the ideal and also teach the reality. Teach about the different ways of taking random samples. Use my video if you like!
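
For teachers who like a concrete demonstration to go with the video, here is one possible sketch, using an invented class roll, showing simple random, systematic and stratified samples drawn from the same list. None of this is prescribed anywhere; it is just one way to make the mechanics visible.

```python
import random

random.seed(4)

# Invented class roll: 30 students with a year level recorded for stratifying
roll = [{"name": f"Student {i}", "year": 9 + i % 3} for i in range(30)]

# Simple random sample of 6 students
simple = random.sample(roll, 6)

# Systematic sample: every 5th student from a random starting point
start = random.randrange(5)
systematic = roll[start::5]

# Stratified sample: 2 students chosen at random from each year level
stratified = []
for year in sorted({s["year"] for s in roll}):
    stratum = [s for s in roll if s["year"] == year]
    stratified.extend(random.sample(stratum, 2))

print(len(simple), len(systematic), len(stratified))
```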

Get students to think about the pros and cons of each method, and where problems could arise. Also get them to think about the kinds of data they are using in their exercises, and what biases they may have.

We also need to teach that, used judiciously, a convenience sample can still be of value. For example I have collected data from students in my class about how far they live from university, and whether or not they have a car. This data is not a random sample of any population. However, it is still reasonable to suggest that it may represent all the students at the university – or maybe just the first year students. It possibly represents students in the years preceding and following my sample, unless something has happened to change the landscape. It has worth in terms of inference. Realistically, I am never going to take a truly random sample of all university students, so this may be the most suitable data I ever get. I have no doubt that it is better than no information.

Not all questions are of equal worth. Knowing whether students who own cars live further from university, in general, is interesting but not of great importance. Were I to be researching topics of great importance, such as safety features in roads or medicine, I would have a greater need for rigorous sampling.

So generally, I see no harm in pretending. I use the data collected from my class, and I say that we will pretend that it comes from a representative random sample. We talk about why it isn’t, but then we move on. It is still interesting data, it is real and it is there. When we write up analysis we include critical comments with provisos on how the sample may have possible bias.

What is important is for students to experience the excitement of discovering real effects (or lack thereof) in real data. What is important is for students to be critical of these discoveries, through understanding the limitations of the data collection process. Consequently I see no harm in using non-random, realistically sampled real data, with a healthy dose of scepticism.

Statistics – Singular and Plural, Lies and Truth

Language is an issue in teaching and learning statistics. There are many words that have meanings in statistics, different from their everyday meaning, and even with multiple meanings within the study of statistics. Examples of troublesome words are: error, correlation, regression, significant, model. I wrote about addressing this in Teaching Statistical Language.

But the problem starts even with the name of the subject. There are at least three meanings for the term “statistics”. The word is not even consistently singular or plural. I suggest three meanings are: Data (plural), analysis (singular) and information (plural). What we teach focusses on the analysis, but involves data and information.

Statistics as Data

Sports people love statistics. Game shows and pub quizzes draw on data such as numbers of Olympic medals, wives, years of warfare, Oscars and a myriad of other subjects. These statistics can be fascinating, relevant, boring or trivial. My most read blog post is entitled “Khan Academy Statistics videos are not good”. I suspect that quite a few people are searching for statistics about Khan Academy, rather than the subject of my post. This is borne out by the fact that a more recent post, “Open Letter to Khan Academy about Basic Probability”, gets considerably less traffic. I suppose there are not many people who want to know about the probability of Khan Academy. Pity – as the second post is better.

There is an entire discipline around “Official Statistics”. At a recent conference (ORSNZ/NZSA) I was fascinated by a presentation given about the need for statistics in a time of disaster and recovery. John Créquer talked about a subject close to my heart, the Christchurch earthquakes. In the weeks and months of the earthquakes authorities needed information of how many people there were of high need, in order to provide adequate service. Finding these numbers was an exercise in ingenuity and co-operation, drawing on data collected for other purposes. The presenter suggested that at times like that a national register would be invaluable. New Zealand does not have such a thing. It is an interesting conflict between the need for privacy and the public good. Créquer is a statistician from Statistics New Zealand, who has been contracted to CERA (The Canterbury Earthquake Recovery Authority) for now.  I had never thought that a statistician had uniquely valuable skills and insights to be used in a time of recovery from disaster.

The internet is an amazing source of the data kind of statistics. You can find out the number of an awful lot of things, simply by putting the question in a search box, or looking on Wikipedia. (I’ve made my annual monetary contribution – have you?). Thanks to Wikipedia, we don’t need to wonder about trivial things anywhere near as much as we used to.

Statistics as Analysis

Statistics, as it is taught and learned as a subject, mostly refers to statistical analysis and the inquiry process in which it is embedded. I sometimes wonder what people are thinking when I say that I produce materials to help people learn statistics. Do they imagine a classful of students memorising the populations of countries and batting averages?

“It is easy to lie with statistics. It is hard to tell the truth without it.”

This quote is from Andrejs Dunkels, a person whom I wish I had met. When I was looking for the source of this quote, I found a tribute page to a man who contributed greatly to the world of statistics. His quote uses statistics as a singular noun.

The analysis aspect of statistics involves taking raw data and turning it into information and evidence of what may be truth. Science would not progress far without the tools of statistics to take the raw results of experiments and observations and, using the insights gained from the mathematical world of probability, discern their significance. Without the discoveries and tools of statistics we would not be able to make sensible inferences about populations from samples and experiments.

Statistical analysis uses mathematical tools, but is far more than just the mathematics. It is easy to produce wrong information by using the mechanistic calculations without thinking critically about the results. I once produced some very wrong models of performance of bank branches, using multiple regression. I even came up with some interesting rationalisations for the counter-intuitive results. Then I did a residual plot and found one outlier that changed everything! Once I removed it, the models changed to the extent that some of the coefficients changed sign. I wonder how many wrong models persist because of well-intentioned, but unskilled analysts.
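
I cannot share the bank-branch data, but the lesson is easy to recreate. The sketch below uses invented data with one wildly influential point; fitting the line with and without it shows how a single observation can even flip the sign of a coefficient, which is exactly the sort of thing a residual plot reveals.

```python
import numpy as np

rng = np.random.default_rng(seed=5)

# Invented data: y is roughly 2x plus noise, with one wildly influential point
x = np.arange(1, 21, dtype=float)
y = 2 * x + rng.normal(0, 2, size=20)
y[-1] = -150                      # the rogue observation

def slope_and_residuals(x, y):
    slope, intercept = np.polyfit(x, y, 1)
    return slope, y - (slope * x + intercept)

slope_all, residuals = slope_and_residuals(x, y)
worst = np.argmax(np.abs(residuals))          # what the residual plot points to
slope_trimmed, _ = slope_and_residuals(np.delete(x, worst), np.delete(y, worst))

# With the outlier in, the slope is negative; without it, close to the true value of 2
print(slope_all, slope_trimmed)
```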

There is a wonderful paragraph I used to quote in my second year statistical methods class, that unfortunately I can’t find – even using Wikipedia. It says, in essence: Statistical models are not sausage machines, taking in data and turning it into information without the interference of a human. If the results do not make sense and align with common understanding of the phenomenon, they are probably wrong.

If someone can direct me to the actual quote, I’d be very happy. I used to get the class to recite it in unison.

The point I am making is that the second meaning of statistics is a combination of science and art. It needs people.

Statistics as Information

This is similar to the first meaning, but I think that processed data should have a home separate from raw data. Statistical results include relationships and differences, not just “the facts.” I would put graphs and tables into this category. I think this category is scarier than statistics as data. Everyone can understand that Henry the Eighth had six wives, and New Zealand won six gold medals at the London Olympics. Those are non-scary statistics, and easily accessible. They are statistics as data or facts.

What is more daunting to many people is the results of analysis. This is where we try to explain the population effect of cancer screening, the significance (statistical) of an increase or decrease in birthrate, the effect of seasonality on the sales of jewellery in the USA, or the evidence that increasing numbers of cows are causing a degradation of water quality in natural water sources. These statistics need to be well presented. Part of our role as teachers is to help future producers of such information to be able to express themselves well so these statistics are accessible. Another part of our role is to help future consumers of statistics to understand them.

Our role is important – for all three types of statistics.

Deterministic and Probabilistic models and thinking

The way we understand and make sense of variation in the world affects decisions we make.

Part of understanding variation is understanding the difference between deterministic and probabilistic (stochastic) models. The NZ curriculum specifies the following learning outcome: “Selects and uses appropriate methods to investigate probability situations including experiments, simulations, and theoretical probability, distinguishing between deterministic and probabilistic models.” This is at level 8 of the curriculum, the highest level of secondary schooling. Deterministic and probabilistic models are not familiar to all teachers of mathematics and statistics, so I’m writing about it today.

Model

The term, model, is itself challenging. There are many ways to use the word, two of which are particularly relevant for this discussion. The first meaning is “mathematical model, as a decision-making tool”. This is the one I am familiar with from years of teaching Operations Research. The second way is “way of thinking or representing an idea”. Or something like that. It seems to come from psychology.

When teaching mathematical models in entry level operations research/management science we would spend some time clarifying what we mean by a model. I have written about this in the post, “All models are wrong.”

In a simple, concrete incarnation, a model is a representation of another object. A simple example is that of a model car or a Lego model of a house. There are aspects of the model that are the same as the original, such as the shape and ability to move or not. But many aspects of the real-life object are missing in the model. The car does not have an internal combustion engine, and the house has no soft-furnishings. (And very bumpy floors). There is little purpose for either of these models, except entertainment and the joy of creation or ownership. (You might be interested in the following video of the Lego Parisian restaurant, which I am coveting. Funny way to say Parisian!)

Many models perform useful functions. My husband works as a land-surveyor, and his work involves making models, on paper or in the computer, of phenomena on the land, and making sure that specified marks on the model correspond to the marks placed in the ground. The purpose of the model relates to ownership and making sure the sewers run in the right direction. (As a result of several years of earthquakes in Christchurch, his models are less deterministic than they used to be, and unfortunately many of our sewers ended up running the wrong way.)

Our world is full of models:

  • a map is a model of a location, which can help us get from place to place.
  • sheet music is a written model of the sound which can make a song
  • a bus timetable is a model of where buses should appear
  • a company’s financial reports are a model of one aspect of the company

Deterministic models

A deterministic model assumes certainty in all aspects. Examples of deterministic models are timetables, pricing structures, linear programming models, the economic order quantity model, maps and accounting.

Probabilistic or stochastic models

Most models really should be stochastic or probabilistic rather than deterministic, but this is often too complicated to implement. Representing uncertainty is fraught. Some more common stochastic models are queueing models, Markov chains, and most simulations.

For example when planning a school formal, there are some elements of the model that are deterministic and some that are probabilistic. The cost to hire the venue is deterministic, but the number of students who will come is probabilistic. A GPS unit uses a deterministic model to decide on the most suitable route and gives a predicted arrival time. However we know that the actual arrival time is contingent upon all sorts of aspects including road, driver, traffic and weather conditions.
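
To make the school formal example concrete, here is a tiny simulation. All the numbers are invented: the venue hire and ticket price are the deterministic parts, attendance is the probabilistic part, and the profit inherits the uncertainty.

```python
import numpy as np

rng = np.random.default_rng(seed=6)

venue_hire = 3500        # deterministic: fixed cost, known in advance (invented)
ticket_price = 45        # deterministic: set by the organisers (invented)

# Probabilistic: each of 120 invited students attends with probability 0.7
attendance = rng.binomial(n=120, p=0.7, size=10_000)
profit = attendance * ticket_price - venue_hire

# Expected profit, spread of outcomes, and the chance of making a loss
print(profit.mean(), profit.std(), (profit < 0).mean())
```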

Model as a way of thinking about something

The term “model” is also used to describe the way that people make sense out of their world. Some people have a more deterministic world model than others, contributed to by age, culture, religion, life experience and education. People ascribe meaning to anything from star patterns, tea leaves and moon phases to ease in finding a parking spot and not being in a certain place when a coconut falls. This is a way of turning a probabilistic world into a more deterministic and more meaningful world. Some people are happy with a probabilistic world, where things really do have a high degree of randomness. But often we are less happy when the randomness goes against us. (I find it interesting that farmers hit with bad fortune such as a snowfall or drought are happy to ask for government help, yet when there is a bumper crop, I don’t see them offering to give back some of their windfall voluntarily.)

Let us say the All Blacks win a rugby game against Australia. There are several ways we can draw meaning from this. If we are of a deterministic frame of mind, we might say that the All Blacks won because they are the best rugby team in the world.  We have assigned cause and effect to the outcome. Or we could take a more probabilistic view of it, deciding that the probability that they would win was about 70%, and that on the day they were fortunate.  Or, if we were Australian, we might say that the Australian team was far better and it was just a 1 in 100 chance that the All Blacks would win.

I developed the following scenarios for discussion in a classroom. The students can put them in order or categories according to their own criteria. After discussing their results, we could then talk about a deterministic and a probabilistic meaning for each of the scenarios.

  1. The All Blacks won the Rugby World Cup.
  2. Eri did better on a test after getting tuition.
  3. Holly was diagnosed with cancer, had a religious experience and the cancer was gone.
  4. A pet was given a homeopathic remedy and got better.
  5. Bill won $20 million in Lotto.
  6. You got five out of five right in a true/false quiz.

The regular mathematics teacher is now a long way from his or her comfort zone. The numbers have gone, along with the red tick, and there are no correct answers. This is an important aspect of understanding probability – that many things are the result of randomness. But with this idea we are pulling mathematics teachers into unfamiliar territory. Social studies, science and English teachers have had to deal with the murky area of feelings, values and ethics forever.  In terms of preparing students for a random world, I think it is territory worth spending some time in. And it might just help them find mathematics/statistics relevant!

Guest Post: Risk, Insurance and the Actuary

Risk is an inherent part of our daily life. As a result, most of us take out insurance policies as a means of protection against scenarios which, were they to occur, may cause hardship, whether for us or, as in the case of life insurance, for our families.

Insurance companies write many types of policies. The mutual risks of the policy holders are shared so that claims made against the policies can be covered at a much reduced cost. If priced fairly, then the premium reflects the contribution of the insured’s risk to overall risk.

As policy holders, we want the best price to cover the risk we are offloading; as shareholders of the insurance company (again, often us, through superannuation), we require the premiums to be sufficient to ensure the company stays in business.

It is then very important that analysts pricing the policies (and those calculating the required level of capital to meet the claim liabilities) have the statistical knowledge necessary to measure risk accurately! Understanding risk is even more critical in the framework of Solvency II (*) capital requirements (if it ever gets enforced).

The task is made more difficult as the duration of the policy life varies considerably. Some insurance cover is claimed against shortly after the incident occurs with a short processing time – automobile accidents for instance typically fit this category. This class of cover is termed short-tail liabilities as payments are completed within a short timeframe of the incident occurring.

Other cases arise many years after the original policy was taken out, or payments may occur many years after the original claim was raised – for example medical malpractice. These are termed long-tail liabilities as payments may be made long after the original policy was activated or the incident occurred. Due to the long forecast horizon and [generally] higher volatility in the claim amounts, long-tail liabilities are inherently more risky.

Life insurance is in its own category as everybody dies sometime.

Meet the data

For convenience, and because it is generally less well understood, we restrict our focus to long-tail liability insurance data.

For each claim we have many attributes, but four that are universal to all claims: payment amount(s), incident date (when the originating event resulting in the claim occurred), payment date(s), and state of claim (are further payments possible or is the claim settled). These attributes allow the aggregation of the individual claim data into a series more amenable for analysis at the financial statement level where the volatility of individual claims should be largely eliminated since the risk is pooled.

Actuaries tend to present their data cumulatively in a table like this:

Actuarial table: the rows are accident years, and the column index (development time in actuarial parlance) is the delay between the accident year and the year of payment.

Thus payments made in development lag 0 correspond to all payments made toward claims in the year the accident occurred. The values in development lag 10 correspond to the sum of the payments made in the eleven years since the accident occurred.
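
A small sketch with invented numbers may help make the layout concrete: start from incremental payments indexed by accident year and development lag, then cumulate across each row to get the traditional presentation described above.

```python
import numpy as np

# Invented incremental payments: rows are accident years, columns development lags.
# NaN marks cells that have not been observed yet (the future part of the triangle).
incremental = np.array([
    [100.0,  60.0,   30.0,   10.0],
    [110.0,  65.0,   35.0, np.nan],
    [120.0,  70.0, np.nan, np.nan],
    [130.0, np.nan, np.nan, np.nan],
])

# The traditional cumulative presentation: lag k holds all payments made in the
# first k+1 development years for that accident year.
cumulative = np.nancumsum(incremental, axis=1)
cumulative[np.isnan(incremental)] = np.nan    # keep the unobserved cells blank

print(cumulative)
```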

This presentation likely arose for a number of reasons, but the most important two being:

  • Cumulative data are much easier to work with in the absence of computers;
  • Volatility is visibly less of an issue further out in the development tail when examining cumulatives.

The nature of the inherited data presentation produces some unfortunate consequences:

  • Variability is hard to separate into parameter uncertainty and process volatility;
  • Calendar year effects (trends down the diagonals) cannot be measured – and therefore cannot be readily predicted;
  • Parameter interpretation is difficult due to the confounding calendar year effects; and
  • Parsimony is hard to achieve.

The actuarial profession attempts to deal with each of these issues in various ways. For instance, the bootstrap is being used to quantify variability. Data may be indexed against inflation to partially account for calendar year trends.

Why spend time on this?

Fundamentally because, if you want to solve a problem, you first have to be sure that the data you are using, and the way you are using it, allow you to solve the problem! The profession has spent much time, energy, and analysis on developing techniques to solve the risk measurement problem, but with the underlying assumption that cumulation is the way to analyse insurance data.

Aside: this is why I enjoy Genetic Programming – not because the algorithm allows the automatic generation of solutions, but rather because you have to formulate the problem very precisely in order to ensure the right problem is solved.

Understanding the problem

The objective of analysis of insurance portfolios is to quantify the expected losses incurred by the insurance company and the volatility (the risk) associated with the portfolio, so that adequate money is raised to pay all liabilities, at a reasonable price, with an excellent profit. Additional benefits may arise, like an improved understanding of the policies being written, targeting of more profitable customers, and so forth, but these are secondary.

Assume the data available are the loss data with the three attributes of accident time, calendar time, and payment. Forget about claim state for now though this is an important factor for future projections.

We immediately identify two time attributes. This suggests time series models are likely a good starting point for analysis. We would also examine the distribution(s) of incremental losses rather than cumulate the losses over time, since cumulation of time series would hide the volatility of the losses at the individual time points – the very component that we are interested in.

Further, we need the ability to distinguish between parameters, parameter uncertainty, and the process volatility. Process volatility and parameter uncertainty drive the critical risk metrics which are essential to ensuring adequate capital is set aside to not only cover the expected losses, but also allow for the unexpected losses should they occur.

Beginning with this foundation, modelling techniques which take the fundamental time-series nature of the data into account are almost certain to provide superior performance to methodologies which mask (for historical reasons mentioned) the time series nature of the data.

Is this new?

Actually, no. All the above considerations of analysis of P&C insurance data were presented many years ago. However, time series approaches are not typically taught to aspiring P&C actuaries. Why?

Perhaps several reasons:

  • Tradition. Like any specialised profession, a system is developed to provide solutions and unless the system is convincingly broken, the uptake of new methodology is resisted.
  • Statistical analysis is complicated. Applying standard formulas to get answers is “easy” when you know the formula.

The catch

Misrepresenting data leads to a flawed model representing the underlying data processes.

The likelihood of such a methodology resulting in the correct mean or a correct measure of the volatility is extremely low. The distributional assumptions are likely completely spurious as the fundamental nature of the data is not recognised.

Wrong model = wrong conclusion, unless you’re unlucky

It is quite a general problem that the wrong statistical technique is applied to solve a statistical problem. This brings to mind the statement: “All models are wrong, but some are useful.” That statement is not entirely fair in my mind, as it (wrongly) places the blame on the model where the blame should actually be on the analyst and their choice of modelling method.

Although we will never find the model driving the underlying data generating process, we can often approximate the data process well (otherwise modelling of any kind would be pointless). These are the useful models. Then you are only unlucky if your model looks like it is useful, but fails when it comes to prediction.

In summary

  • The problem of quantifying risk is not a simple exercise
  • Insurance data is fundamentally financial time series data
  • The right starting point is critical to any statistical analysis
  • We statisticians need to explain our solutions in a way that is meaningful to established professions

(*) In essence, Solvency II comprises insurance legislation aiming to improve policyholder protection by introducing a clear, comprehensive framework for a market-consistent risk model. In particular, insurance companies must be able to withstand a 1-in-200-year loss event in the next calendar year, encompassing all levels of risk sources – insurance and reserve risk, catastrophe risk, operational risk, and default risk to name a few. Quantitative impact study documents are available here; a general discussion of Solvency II can be found here. The legislation has been postponed many times.

About David Munroe

David Munroe leads Insureware’s outstanding statistical department. Comments in this article are the author’s own and do not necessarily represent the position of Insureware Pty Ltd.

He completed a Masters degree in Statistics (with First Class Honours) at Massey University, New Zealand.

David has experience in statistical and actuarial analysis along with C++ programming knowledge. Previous projects include working with a Canadian insurance company on software training and implementation, resulting in significant modelling improvements (regions can be modelled within a working day, allowing analysts to focus on providing extracted insights to management).

David studied the art of Shaolin Kempo for over nine years, holds a second degree black belt, and is qualified in the use of Okinawan weaponry. He is also interested in music (piano), literature, photography, and self sufficiency. He also has two children on the autism spectrum.

Analysis of “Deal or No Deal” results


My son, Jonathan, loves game-shows, and his current favourite is Deal or No Deal, the Australian version. It has been airing now for over ten years, and there is at least one episode available every weeknight on New Zealand television. I often watch it with him as it is a nice time to spend together. We discuss whether people should take the deal or not, and guess what the bank offer will be. There are other followers of the programme, equally devoted, and I am grateful to Paul Corfiatis and his mum who fastidiously collected data for all the 215 programmes in 2009 on the final takings, the case chosen and the case containing the $200,000. In this post I analyse this data, and give some ideas of how this can be used in teaching.

Deal or No Deal, explained

You can find out ALL about Deal or No Deal on Wikipedia. I was excited to see our New Zealand radio gameshow, “The Money or the Bag”, given as an antecedent. There are numerous incarnations of the game. The basic idea is that there are 26 cases, containing a range of money values from 50c to $200,000. The money values are randomly assigned and their allocation unknown to the contestant and the “banker”. The contestant chooses one of the cases, and chats to the host, Andrew O’Keefe, about what they will do with the money when they win. The usual responses are to have a big wedding or travel. As the programme is filmed in Melbourne, often second generation Australians are wanting to visit their parents’ homeland. Usually the contestant has a friend or family member as a podium player, who interacts as part of the banter. In the first round, the player chooses six cases to open, thus gaining information about the possible value in their case. At the end of the round, the banker offers a sum of money to buy back the case from the contestant, who must choose “Deal” (take the money) or “No Deal” (keep the case and its contents). In the second round five cases are opened and then there is another bank offer. This continues until the sixth round, and from then on the cases are opened one at a time, with an offer made after each one. The player either takes the deal at some point, or holds out until the end, at which point they take the contents of the case. There are other variants on this basic game, to add variety.

Human aspects

My son is blind and has autism, and finds much to like about this programme. He likes the order of it all – every night, a very similar drama is played out, and he can understand exactly what is happening. He also likes the agony and the joy. He gets very excited when the case containing $200,000 is opened with the special sound effect, and Andrew says, “Oh No”. He likes hearing about the people, and their lives and he likes that you never know how much you might win.

I also like the drama and the joy, but I’d rather not watch when it is going badly. I like it because it is an insight into people’s perceptions of chance. Like many people, I yell at the screen, telling them to take the deal when we see them being reckless, but I am usually happy when their foolish decisions turn out well.  To me it is a true reality show – not because the situation is in any way like reality, but because the people are authentic in their responses. I have been known to weep when a nice person wins a sizeable amount of money. One day I would love to go on the show, as I know how much joy that would bring Jonathan, to be a part of it.

Part of the appeal is the collective experience of it all. The podium players, the audience and the people at home feel connected to the main contestant. One episode that Jonathan loves to tell people about is with Josh Sharpe who was REALLY unlucky. You can see that here on YouTube:

The probability

The probability calculation for Deal or No Deal is very simple. The contestant has one chance in twenty-six that their case contains the big prize. They have four chances in twenty-six that their case contains a prize of $50,000 or more. The expected value of their prize, if they hold onto their case to the end, is about $19,900 (valuing the car at $30,000). When the banker makes an offer, it is often around the expected value of the remaining unopened cases. (The average amount left.) There are times when the offer is considerably lower or higher than the expected value, which seems to be in an effort to push the contestant one way or the other. Contestants very seldom take the deal in the early rounds of the game.
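
For anyone who wants to reproduce these figures, the calculation is just counting and averaging. I have not reproduced the actual 26 Australian case values here, so the sketch below expects you to supply them (with the car valued at whatever cash amount seems fair); with the real prize list it should give roughly 1/26, 4/26 and about $19,900.

```python
from fractions import Fraction
from statistics import mean

def deal_or_no_deal_summary(case_values):
    """Probabilities and expected winnings for a contestant who keeps their case.

    `case_values` should be the 26 money amounts in play; they are not
    reproduced here, so supply the real list (valuing the car case in cash).
    """
    n = len(case_values)
    p_top_prize = Fraction(1, n)                                     # 1 in 26
    p_50k_or_more = Fraction(sum(v >= 50_000 for v in case_values), n)
    expected_winnings = mean(case_values)
    return p_top_prize, p_50k_or_more, expected_winnings
```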

There are a number of interesting questions we can explore:

  • What is the distribution of the actual outcomes for contestants?
  • How often do contestants do better than what is in their case?
  • Are there any “lucky” cases that contain the big prize more often than others?

To explore these questions I am using the data so diligently collected by Paul Corfiatis. I will use data from games with the regular list of prizes, not “Fantastic Four”, which has some more high value cases.

What is the actual outcome for the contestants?

The following graph shows the amount of money the contestants win, either by taking the deal or hanging out for the case.

Deal or No Deal outcomes

You could have an interesting discussion about the factors to account for in looking at this. You would expect the mean to be lower for the “case” prizes, as they tend to be people who have kept going to the bitter end. There is a very large standard deviation.

Here is a table of results:

                      Case       Deal       Either case or deal
Number of instances   53         146        199
Mean                  $6139      $21,044    $17,075
Median                $500       $18,350    $15,000
Standard Deviation    $13,740    $14,499    $15,721
Minimum               $0.50      $950       $0.50
Maximum               $50,000    $100,000   $100,000

How often do contestants do better than what is in their case?

For this I calculated the prize less the amount that was in their case. The mean value was $1082, with a median of $9969.50, a minimum of -$170,050 and a maximum of $99,995. Contestants who took the deal did better 106 times out of 146, or 73% of the time.

Lucky Cases

And of course the one to make the statisticians smile – are there any lucky cases?

Here is a graph of the distribution of cases that held the $200,000. I am tempted to make glib comments about how clearly 14 is a lucky case, so you should pick that one, but then, maybe you should pick 19, as it hasn’t had the $200,000 much. But as you never know who is going to quote you, I’d better not.

Which case contained the $200,000 in 2007.

Educational use for this

Depending on how much you wish to torment your students, and the educational objectives, you could give them the raw data, as provided on the site, and see what they come up with.  Or you could simply present the results given in this post, watch an episode, and discuss what meanings people could take from the data, and what misconceptions might occur.

About blogging

This is the 100th post on “Learn and Teach Statistics and Operations Research”. To celebrate, I am writing about the joys of blogging.

Anyone with an internet connection can blog these days, and do! It is the procrastinator’s “dark playground” to read blogs on pretty much anything you want to know. (For an explanation, with pictures, of the dark playground, where the instant gratification monkey holds sway until the panic monster arrives, see this entertaining post: Why Procrastinators Procrastinate.)

I started to blog to build a reputation for knowing about teaching statistics and operations research. This would lead people to buy our apps, subscribe to our on-line materials and watch my YouTube videos. Many blogs are set up, like this, in order to build credibility and presence on the internet. I’ve found it quite exciting to watch the readership grow, and I particularly love it when people comment. I also like to feel that I am doing some good in the world. The process of writing is also a learning process for me.

Here are some lightly structured thoughts about what I’ve learned over the last 99 posts.

A blog is not a scholarly research paper

As I come from an academic background, I have had to remind myself that a blog is different from a scholarly research paper. A blog isn’t scholarly, it isn’t based on research (unless you can call time in the shower that) and it isn’t on paper.

Blogging rewards bad behaviour.

The more opinionated you are, and the less evidence you use to support your argument, the more readers you get.  You must remove equivocation. Often after I write my first draft, I go through and remove statements like “in my opinion” or  “it seems”.  This is the antithesis of a scholarly paper, which must be carefully stated in balanced and measured tones.

Blogs are personal

It is good to be personal in a blog. In journal articles we avoid the use of first person language as if the paper were somehow written by itself. This can give rise to convoluted sentence structures and endless passive voice. When I write my blog, I talk about my own ideas, and even aspects of my life. I mention side tracks, and give a little bit of myself. And I prefer to read blogs that have a bit of the author in them. I think you need a little touch of narcissism to enjoy blogging.

Quantity is more important than quality

Volume in blogging dominates quality. Some might argue that this is also true for academic papers. In a blog you are better to dash off one opinion piece a week, than put the same effort into one scholarly paper. If one falls flat, it really doesn’t matter.

Blogs give instant gratification

Blogs have a quick turn-around, ideal for people with short attention spans who want instant gratification. In academia the delay between doing the research and seeing it in print is measured in years. By the time an article has been through the review process, you have almost forgotten why you did the research in the first place. And don’t really care anymore. But when you blog and click “Publish”, it is out there in the world for all to see.

People read blogs

People read your blog. It is an amazing feeling to send out my thoughts into the world and watch the viewing stats on WordPress, knowing that hundreds and sometimes even thousands of people are reading my opinion, literally all over the world. And sometimes I even get emails from fans, telling how my post has helped them or inspired them to work or do research in the area of statistics education. Or else I find that an educational institution has set a link to one of my posts for their students to read. In contrast I wonder if anyone has ever read my journal articles, apart from the reviewers. Not only do people read your blog, but you can see where they live and what they read, and even what search engine terms brought them to the blog. Some search terms boggle the mind, first that someone entered them, and secondly that they led to my blog! The term “rocks” has led to my site 66 times in the last two years, which I am sure was disappointing for the searcher. The most common search term is “causation”.

Blogging does not get you promoted

Though blogging is fun and great for attention-seekers, it does not improve your PBRF ratings (in NZ) or whatever the measure of publication activity is in a specific country. Nor does blogging count for promotion or tenure. This may be simply a matter of time to allow attitudes to change, as erudite blogs can get scientific findings out into the public domain far more rapidly than the old print-based system.

People can be mean

A blogger needs to have a thick skin. I don’t yet, and have to remind myself that I didn’t research my article, so it is only fair for people to offer opposing views. In fact, one of the great qualities of a blog is that anyone can respond and improve the quality of the blog. I love it when people leave comments; it is the emailed “hate-messages” that are a bit upsetting.

Keynote speaker

One spin off of a successful blog is that you get asked to be a keynote speaker.

Actually I’m kidding on that one. I’d love to be a keynote speaker, and I’m pretty sure I could entertain a crowd and give them something to think about for an hour or so, but it hasn’t happened. Yet. Any invitations?