About Dr Nic

I love to teach just about anything. My specialties are statistics and operations research. I have insider knowledge on Autism through my family. I have a lovely husband, two grown-up sons, a fabulous daughter-in-law and a new adorable grandson. I have four blogs - Learn and Teach Statistics, Never Ordinary Life, Chch Relief Society and StatsLC News.

Nominal, Ordinal, Interval, Schmordinal

Everyone wants to learn about ordinal data!

I have a video channel with about 40 videos about statistics, and I love watching to see which videos get the most views each day. As the Fall term has recently started in the northern hemisphere, the most popular video over the last month is “Types of Data: Nominal, Ordinal, Interval/Ratio.” Similarly, one of the most consistently viewed posts on this blog is one I wrote over a year ago, entitled “Oh Ordinal Data, what do we do with you?”. Understanding the different levels of data, and what we do with them, is obviously an important introductory topic in many statistics courses. In this post I’m going to look at why this is, as it may prove useful to learner and teacher alike.

And I’m happy to announce the launch of our new Snack-size course: Types of Data. For $2.50US, anyone can sign up and get access to video, notes, quizzes and activities that will help them, in about an hour, gain a thorough understanding of types of data.

Costing no more than a box of popcorn, our snack-size course will help you learn all you need to know about types of data.


The Big Deal

Data is essential to statistical analysis. Without data there is no investigative process. Data can be generated through experiments, through observational studies, or dug out from historic sources. I get quite excited at the thought of the wonderful insights that good statistical analysis can produce, and the stories it can tell. A new database to play with is like Christmas morning!

But not all data is the same. We need to categorise data to decide how to analyse it and which graphs are most appropriate. There are many good and not-so-good statistical tools available, thanks to the wonders of computing power, but they need to be driven by someone with some idea of what is sensible or meaningful.

A video that becomes popular later in the semester is entitled, “Choosing the test”. This video gives a procedure for deciding which of seven common statistical tests is most appropriate for a given analysis. It lists three things to think about – the level of data, the number of samples, and the purpose of the analysis. We developed this procedure over several years with introductory quantitative methods students. A more sophisticated approach may be necessary at higher levels, but for a terminal course in statistics, this helped students to put their new learning into a structure. Being able to discern what level of data is involved is pivotal to deciding on the appropriate test.

Categorical Data

In many textbooks and courses, the types of data are split into two – categorical and measurement. Most state that nominal and ordinal data are categorical. With categorical data we can only count the responses to a category, rather than collect up values that are measurements or counts themselves. Examples of categorical data are colour of car, ethnicity, choice of vegetable, or type of chocolate.

With nominal data, we report frequencies or percentages, and display our data with a bar chart or, occasionally, a pie chart. We can’t find a mean of nominal data. However, if the different responses are coded as numbers for ease of use in a database, it is technically possible to calculate the mean and standard deviation of those numbers. A novice analyst may do so and produce nonsense output.
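
If you want to show students just how meaningless that nonsense output is, a few lines of code make the point. Here is a minimal Python sketch (the car-colour data and the coding are made up purely for illustration; the same demonstration could be done in a spreadsheet or in R):

```python
from collections import Counter
from statistics import mean

# Hypothetical nominal data: colour of car, coded as numbers in a database
colours = ["red", "blue", "red", "white", "silver", "blue", "red"]
codes = {"red": 1, "blue": 2, "white": 3, "silver": 4}

# Appropriate summary for nominal data: frequencies (or percentages)
print(Counter(colours))                    # Counter({'red': 3, 'blue': 2, ...})

# Technically possible, but meaningless: the "mean colour"
print(mean(codes[c] for c in colours))     # 2.0 -- a number that says nothing about colour
```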

The very first data most children will deal with is nominal data. They collect counts of objects and draw pictograms or bar charts of them. They ask questions such as “How many children have a cat at home?” or “Do more boys than girls like Lego as their favourite toy?” In each of these cases the data is nominal, probably collected by a survey asking questions like “What pets do you have?” and “What is your favourite toy?”

Ordinal data

Another category of data is ordinal, and this is the one that causes the most problems in understanding. My earlier blog post discusses this. Ordinal data has order, and the numbers assigned to responses are meaningful, in that each level is “more” than the previous level. We are frequently exposed to ordinal data in opinion polls, asking whether we strongly disagree, disagree, agree or strongly agree with something. It would be acceptable to put the responses in the opposite order, but it would be confusing to list them in alphabetical order: agree, disagree, strongly agree, strongly disagree. What stops ordinal data from being measurement data is that we can’t be sure how far apart the different levels on the scale are. Sometimes it is obvious that we can’t tell how far apart they are. An example of this might be the scale assigned by a movie reviewer. It is clear that a 4-star movie is better than a 3-star movie, but we can’t say how much better. Other times, when a scale is well defined and the circumstances are right, ordinal data is appropriately, if cautiously, treated as interval data.

Measurement Data

The most versatile data is measurement data, which can be split into interval or ratio, depending on whether ratios of the numbers have meaning. For example, temperature in degrees Celsius or Fahrenheit is interval data, as it makes no sense to say that 70 degrees is twice as hot as 35 degrees. Weight, on the other hand, is ratio data, as it is true to say that 70 kg is twice as heavy as 35 kg.

A more useful way to split up measurement data, for statistical analysis purposes, is into discrete or continuous data. I had always explained that discrete data was counts, and recorded as whole numbers, and that continuous data was measurements, and could take any values within a range. This definition works to a certain degree, but I recently found a better way of looking at it in the textbook published by Wiley, Chance Encounters, by Wild and Seber.

“In analyzing data, the main criterion for deciding whether to treat a variable as discrete or continuous is whether the data on that variable contains a large number of different values that are seldom repeated or a relatively small number of distinct values that keep reappearing. Variables with few repeated values are treated as continuous. Variables with many repeated values are treated as discrete.”

An example of this is the price of apps in the App Store. There are only about twenty prices that can be charged – 0.99, 1.99, 2.99 and so on. These are neither whole numbers nor counts, but as you cannot have a price in between the given numbers, and there is only a small number of possibilities, this is best treated as discrete data. Conversely, the number of people attending a rock concert is a count, and you cannot get fractions of people. However, as there is a wide range of possible values, and it is unlikely that you will get exactly the same number of people at more than one concert, this data is best treated as continuous.
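
A rough way to play with the Wild and Seber criterion in class is to count how many distinct values appear relative to the number of observations. This Python sketch uses made-up app prices and concert attendances, and an arbitrary 50% cut-off chosen purely for illustration:

```python
# Applying the Wild and Seber criterion, roughly: look at how many distinct
# values appear relative to the number of observations. The data are made up.

app_prices = [0.99, 1.99, 0.99, 2.99, 0.99, 4.99, 1.99, 0.99, 2.99, 1.99]
concert_attendance = [12453, 8712, 15220, 9871, 14102, 11987, 13340]

def treat_as_continuous(data, threshold=0.5):
    """Mostly-distinct (seldom repeated) values -> treat as continuous."""
    return len(set(data)) / len(data) > threshold

print(treat_as_continuous(app_prices))          # False -> treat as discrete
print(treat_as_continuous(concert_attendance))  # True  -> treat as continuous
```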

Maybe I need to redo my video now, in light of this!

And please take a look at our new course. If you are an instructor, you might like to recommend it for your students.

A Statistics-centric curriculum

Calculus is the wrong summit of the pyramid.

“The mathematics curriculum that we have is based on a foundation of arithmetic and algebra. And everything we learn after that is building up towards one subject. And at top of that pyramid, it’s calculus. And I’m here to say that I think that that is the wrong summit of the pyramid … that the correct summit — that all of our students, every high school graduate should know — should be statistics: probability and statistics.”

TED talk by Arthur Benjamin in February 2009. Watch it – it’s only 3 minutes long.

He’s right, you know.

And New Zealand would be the place to start. In New Zealand, statistics is the second most popular subject in our final year of schooling, with a cohort of 12,606. By comparison, the cohort for English is 16,445, and calculus has a final-year cohort of 8,392, similar in size to Biology (9,038), Chemistry (8,183) and Physics (7,533).

Some might argue that statistics is already the summit of our curriculum pyramid, but I would see it more as an overly large branch that threatens to unbalance the mathematics tree. I suspect many maths teachers would see it more as a parasite that threatens to suck the life out of their beloved calculus tree. The pyramid needs some reconstruction if we are really to have a statistics-centric curriculum. (Or the tree needs pruning and reshaping – I think I have too many metaphors!)

Statistics-centric curriculum

So, to use a popular phrase, what would a statistics-centric curriculum look like? And what would be the advantages and disadvantages of such a curriculum? I will deal with implementation issues later.

To start with, the base of the pyramid would look little different from the calculus-pinnacled pyramid. In the early years of schooling the emphasis would be on number skills (arithmetic), measurement and other practical and concrete aspects. There would also be a small but increased emphasis on data collection and uncertainty. This is in fact present in the NZ curriculum. Algebra would be introduced, but as a part of the curriculum, rather than the central idea. There would be much more data collection, and probability-based experimentation. Uncertainty would be embraced, rather than ignored.

In the early years of high school, probability and statistics would take a more central place in the curriculum, so that students develop important skills ready for their pinnacle course in the final two years. They would know about the statistical enquiry cycle, and how to plan and collect data and write questionnaires. They would perform their own experiments, preferably in tandem with other curriculum areas such as biology, food-tech or economics. They would understand randomness and modelling. They would be able to make critical comments about reports in the media. They would use computers to create graphs and perform analyses.

As they approach the summit, most students would focus on statistics, while those who were planning to pursue a career in engineering would also take calculus. In the final two years students would be ready to build their own probabilistic models to simulate real-world situations and solve problems. They would analyse real data and write coherent reports. They would truly understand the concept of inference, and why confidence intervals are needed, rather than calculating them by hand or deriving formulas.

There is always a trade-off. Here is my take on the skills developed in each of the curricula.

Calculus-centric curriculum                 | Statistics-centric curriculum
Logical thinking                            | Communication
Abstract thinking                           | Dealing with uncertainty and ambiguity
Problem-solving                             | Probabilistic models
Modelling (mainly deterministic)            | Argumentation, deduction
Proof, induction                            | Critical thinking
Plotting deterministic graphs from formulas | Reading and creating tables and graphs from data

I actually think you also learn many of the calc-centric skills in the stats-centric curriculum, but I wanted to appear even-handed.

Implementation issues

Benjamin suggests, with charming optimism, that the new focus would be “easy to implement and inexpensive.” I have been a very interested observer of the implementation of the new statistics curriculum in New Zealand. It has not happened easily, doing it inexpensively has proved costly in other ways, and there has been fallout. Teachers from other countries (of whom there are many teaching mathematics in NZ) have expressed amazement at how much NZ teachers accept with only murmurs of complaint. We are a nation with a “can do” attitude, which, by virtue of a small population and a one-tier government, can be very flexible. So long as we refrain from following the follies of our big siblings, the UK, US and Australia, NZ has managed to have a world-class education system. And when a new curriculum is implemented, though there is unrest and stress, there is seldom outright rebellion.

In my business, I get the joy of visiting many schools and talking with teachers of mathematics and statistics. I am fascinated by the difference between schools, which is very much a function of the head of mathematics and principal. Some have embraced the changes in focus, and are proactively developing pathways to help all students and teachers to succeed. Others are struggling to accept that statistics has a place in the mathematics curriculum, and put the teachers of statistics into a ghetto where they are punished with excessive marking demands.

The problem is that the curriculum change has been done “on the cheap”. As well as being small and nimble, NZ is not exactly rich. The curriculum change needed more advisors, more release time for teachers to develop their teaching, and more computing power. These all cost money. And then you have the problem of “me too” from other subjects that feel they have undergone similar changes.

And this is not really embracing a full stats-centric curriculum. Primary school teachers need training in probability and statistics if we are really to implement Benjamin’s idea fully. The cost here is much greater as there are so many more primary school teachers. It may well take a generation of students to go through the curriculum and enter back as teachers with an improved understanding.

Computers make it possible

Without computers, the only statistical analysis possible in the classroom was trivial. Statistics was reduced to mechanistic and boring hand calculation of lightweight statistics and time-filling graph construction. With computers, graphs and analyses can be produced at the click of a mouse, making graphs a tool rather than an endpoint. With computing power available, real data can be used and real problems can be addressed. High-level thinking is needed to make sense of the output, form judgements, and avoid wrong conclusions.

Conversely, the computer has made much of calculus superfluous. With programs that can bash their way happily through millions of iterations of a heuristic algorithm, the need for analytic methods is seriously reduced. When even simple apps on an iPad can solve an algebraic equation, and Excel can use “What if” to find solutions, the need for algebra is also questionable.

Efficient citizens

In H.G. Wells’ popular but misquoted words, efficient citizenry calls for the ability to make sense of data. As the science fiction-writer that he was, he foresaw the masses of data that would be collected and available to the great unwashed. The levelling nature of the web has made everyone a potential statistician.

According to the engaging new site from the ASA, “This is statistics”, statisticians make a difference, have fun, satisfy curiosity and make money. And these days they don’t all need to be good at calculus.

Let’s start redesigning our pyramid.

Sampling error and non-sampling error

The subject of statistics is rife with misleading terms. I have written about this before in such posts as Teaching Statistical Language and It is so random. But the terms sampling error and non-sampling error win the Dr Nic prize for counter-intuitivity and confusion generation.

Confusion abounds

To start with, the word error implies that a mistake has been made, so the term sampling error makes it sound as if we made a mistake while sampling. Well, this is wrong. And the term non-sampling error (why is this even a term?) sounds as if it is the error we make from not sampling. And that is wrong too. However, these terms are used extensively in the NZ statistics curriculum, so it is important that we clarify what they are about.

Fortunately the Glossary has some excellent explanations:

Sampling Error

“Sampling error is the error that arises in a data collection process as a result of taking a sample from a population rather than using the whole population.

Sampling error is one of two reasons for the difference between an estimate of a population parameter and the true, but unknown, value of the population parameter. The other reason is non-sampling error. Even if a sampling process has no non-sampling errors then estimates from different random samples (of the same size) will vary from sample to sample, and each estimate is likely to be different from the true value of the population parameter.

The sampling error for a given sample is unknown but when the sampling is random, for some estimates (for example, sample mean, sample proportion) theoretical methods may be used to measure the extent of the variation caused by sampling error.”
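
The point that estimates from different random samples vary, even when nothing has gone wrong, is easy to demonstrate with a quick simulation. Here is a minimal Python sketch using a made-up population of household incomes (the figures and distribution are invented for illustration only):

```python
import random

random.seed(1)

# A made-up "population": 10,000 household incomes (right-skewed for realism)
population = [random.lognormvariate(10, 0.5) for _ in range(10_000)]
true_mean = sum(population) / len(population)

# Draw several random samples and compare the sample means to the true mean.
# The differences are sampling error -- no mistake has been made anywhere.
for i in range(5):
    sample = random.sample(population, 100)
    sample_mean = sum(sample) / len(sample)
    print(f"Sample {i + 1}: mean = {sample_mean:,.0f} "
          f"(true mean = {true_mean:,.0f})")
```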

Non-sampling error:

“Non-sampling error is the error that arises in a data collection process as a result of factors other than taking a sample.

Non-sampling errors have the potential to cause bias in polls, surveys or samples.

There are many different types of non-sampling errors and the names used to describe them are not consistent. Examples of non-sampling errors are generally more useful than using names to describe them.”

And it proceeds to give some helpful examples.

These are great definitions, and I thought about turning them into a diagram. Here it is:

Table summarising types of error.

And there are now two videos to go with the diagram, to help explain sampling error and non-sampling error.

Video about sampling error

Video about non-sampling error

One of my earliest posts, Sampling Error Isn’t, introduced the idea of using variation due to sampling and other variation as a way to make sense of these ideas. The sampling video above is based on this approach.

Students need lots of practice identifying potential sources of error in their own work, and in critiquing reports. In addition, I have found True/False questions surprisingly effective for practising the correct use of the terms. Whatever engages the students for a time in consciously deciding which term to use is helpful in getting them to understand and be aware of the concept. Then the odd terminology will cease to have its original confusing connotations.

Teaching random variables and distributions

Why do we teach about random variables, and why is it so difficult to understand?

Probability and statistics go together pretty well, and basic probability is included in most introductory statistics courses. Often maths teachers prefer the probability section as it is more mathematical than inference or exploratory data analysis. Both probability and statistics deal with the idea of uncertainty and chance, statistics mostly being about what has happened, and probability about what might happen. Probability can be, and often is, reduced to fun little algebraic puzzles, with little link to reality. But a sound understanding of the concepts of probability and distribution is essential to H.G. Wells’s “efficient citizen”.

When I first started on our series of probability videos, I wrote about the worth of probability. Now we are going a step further into the probability topic abyss, with random variables. For an introductory statistics course, it is an interesting question whether to include random variables. Is it necessary for the future marketing managers of the world, the medical practitioners, the speech therapists, the primary school teachers and the lawyers to understand what a random variable is? Actually, I think it is. Maybe it is not as important as understanding concepts like risk and sampling error, but random variables are still important.

Random variables

Like many concepts in our area, a random variable can be hard to explain once you understand it yourself. Now that I understand what a random variable is, it is difficult to remember what was difficult to understand about it. But I do remember feeling perplexed, trying to work out what exactly a random variable was. Lecturers used the term freely, but I remember (many decades ago) just not being able to pin down what a random variable is, or why it needed to exist.

To start with, the words “random variable” are difficult on their own. I have dedicated an entire post to the problems with “random”, and in the writing of it, discovered another inconsistency in the way that we use the word. When we are talking about a random sample, random implies equal likelihood. Yet when we talk about things happening randomly, they are not always equally likely. The word “variable” is also a problem. Surely all variables vary? Students may wonder what a non-random variable is – I know I did.

I like to introduce the idea of variables, as part of mathematical modelling. We can have a simple model:

Cost of event = hall hire + (per capita charge × number of guests).

In this model, the hall hire and per capita charge are both constants, and the number of guests is a variable. The cost of the event is also a variable, and can be expressed as a function of the number of guests. And vice versa! Now if we know the number of guests, we can then calculate the cost of the event. But the number of guests may be uncertain – it could be something between 100 and 120. It is thus a random variable.
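
If you like to see this in code, here is a minimal Python sketch of the event-cost model. The hall hire, per capita charge and the 100–120 guest range are all made-up figures, and equal likelihood of each guest count is assumed purely for simplicity:

```python
import random

random.seed(42)

HALL_HIRE = 500          # constant (made-up figure)
PER_CAPITA_CHARGE = 25   # constant (made-up figure)

def cost_of_event(number_of_guests):
    return HALL_HIRE + PER_CAPITA_CHARGE * number_of_guests

# The number of guests is uncertain -- somewhere from 100 to 120, assumed
# equally likely here. That makes it, and therefore the cost, a random variable.
simulated_costs = [cost_of_event(random.randint(100, 120)) for _ in range(10_000)]

print(min(simulated_costs), max(simulated_costs))   # 3000 3500
print(sum(simulated_costs) / len(simulated_costs))  # around 3250
```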

Another way to look at a random variable is to come from the other direction – start with the random part and add the variable part. When something random happens, sometimes the outcome is discrete and non-numerical, such as the sex of a baby, the colour of a tulip, or the type of fruit in a lunchbox. But when the random outcome is given a value, then it becomes a random variable.

Distributions

Pictorial representation of different distributions

Then we come to distributions. I fear that too often distributions are taught in such a way that students believe the normal or bell curve is a property guiding the universe, rather than a useful model that works in many different circumstances. (Rather like Adam Smith’s invisible hand that economists worship.) I’m pretty sure that is what I believed for many years, in my fog of disconnected statistical concepts. Somewhat telling is the tendency for examples to begin with the words, “The life expectancy of a particular brand of lightbulb is normally distributed with a mean of …” or similar. Worse still, some questions don’t even mention the normal distribution, and simply say, “The mean income per household in a certain state is $9500 with a standard deviation of $1750. The middle 95% of incomes are between what two values?” Students are left to assume that the normal distribution will apply, which in the second case is only a very poor approximation, as incomes are likely to be skewed. This sloppy question-writing perpetuates the idea of the normal distribution as the rule that guides the universe.

Take a look at the textbook you use, and see what language it uses when asking questions about the normal distribution. The two examples above are from a popular AP statistics test preparation text.

I thought I’d better take a look at what Khan Academy did to random variables. I started watching the first video and immediately got hit with the flipping coin and rolling dice. No, people – this is not the way to introduce random variables! No one cares how many coins are heads. And even worse he starts with a zero/one random variable because we are only flipping one coin. And THEN he says that he could define a head as 100 and tail as 703 and…. Sorry, I can’t take it anymore.

A good way to introduce random variables

After LOTS of thinking and explaining, and trying stuff out, I have come up with what I think is a revolutionary and fabulous way to introduce random variables and distributions. You can see it for yourself. To begin with we use a discrete empirical distribution to illustrate the idea of a random variable. The random variable models the number of ice creams per customer.

Then we use that discrete distribution to teach about expected value and standard deviation, and combining random variables.
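
For teachers who want a quick way to check the arithmetic, here is a Python sketch of expected value, standard deviation and combining two independent copies of a discrete random variable. The probabilities are invented for illustration; they are not the ones used in the video:

```python
# Expected value and standard deviation of a discrete random variable:
# X = number of ice creams bought per customer (probabilities made up).

values = [1, 2, 3, 4]
probs  = [0.4, 0.3, 0.2, 0.1]

expected = sum(x * p for x, p in zip(values, probs))
variance = sum((x - expected) ** 2 * p for x, p in zip(values, probs))
std_dev = variance ** 0.5

print(expected)   # E(X) = 2.0
print(std_dev)    # SD(X) = 1.0 for these made-up probabilities

# Combining random variables: for two independent customers,
# E(X1 + X2) = E(X1) + E(X2) and Var(X1 + X2) = Var(X1) + Var(X2).
print(2 * expected, (2 * variance) ** 0.5)   # 4.0 and about 1.41
```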


The third video introduces the idea of families of distributions, and shows how different distributions can be used to model the same random process.

Another unusual feature is the introduction of the triangular distribution, which is part of the New Zealand curriculum. You can read here about the benefits of teaching the triangular distribution.
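
One attraction of the triangular distribution is that it only needs a minimum, a most likely value and a maximum, which students can estimate for themselves. A small Python sketch, with made-up walking times, shows how easy it is to generate values from it:

```python
import random

random.seed(7)

# Triangular distribution: minimum, maximum and most likely value (mode).
# Made-up scenario: time to walk to school, between 10 and 30 minutes,
# most often about 15 minutes.
times = [random.triangular(10, 30, 15) for _ in range(10_000)]

print(min(times), max(times))
print(sum(times) / len(times))   # close to (10 + 30 + 15) / 3, about 18.3
```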

I’m pretty excited about this approach to teaching random variables and distributions. I’d love some feedback about it!


Dr Nic goes to ICOTS9

I had a great time at ICOTS9. Academic conferences are a bit of a lottery, but ICOTS is two for two for me. Both ICOTS8 and ICOTS9 were winners – enjoyable, interesting and inspiring.  I’ve just returned from ICOTS9 in Flagstaff, Arizona, several kg heavier, with lots of ideas for teaching and our videos, and feeling supported in the work I am doing on this blog, and with our resources and videos. I have met smart, good people who are genuinely trying to make things better in the world, by helping people learn about statistics.

Most times when I go to an academic conference, I feel happy if I get to one good, understandable and inspiring session per day. ICOTS conferences are different. I attended every session and just about all of the papers and presentations gave me something to take home.

Some aspects of conferences are communal, and some are more individual. The Keynote speakers provide a shared experience to discuss. The keynote speakers at ICOTS9 were all interesting and inspiring, and the highlight was Sir David Spiegelhalter. I’m a bit disappointed I didn’t get a photo with him, but I was hurrying off on the Grand Canyon expedition. (I was going to call it “Dr Nic meets Sir David”!) Monday’s keynote was with Pedro Silva, giving advice on how to maintain “fitness” as a professional statistician. I was impressed at how he saved up his own money to attend conferences and learn things. I am putting into place my own development plan. Then on Tuesday Sir David Spiegelhalter gave us a modified version of what he gives to schoolkids. Bacon and breast-cancer screening provided examples of risk interpretation. He reiterated the necessity of using frequencies rather than probabilities in communicating and working with questions of risk. The highlight for me, which I also tweeted, was Sir David’s statement that “combinatorics has no place in a course on probability.” I also appreciated his analysis of the PISA results, which are given far too much weight, when changes may nearly all be attributable to chance.

Wednesday’s keynote speaker, Rachel Fewster, was from New Zealand and gave some great examples of team-based learning at post-secondary level schooling. In fact the course she was talking about was second-year uni, equivalent to Junior year in the USA. I liked the idea of baking dice, and seeing if it changed the probabilities. Her students’ video on sample size effect was particularly engaging. Another idea that appealed to me was to randomly assign students to be spies and agents, and then use their results based on different criteria to detect how many spies and agents there were in each group. I’m sure there are many applications for such an activity.

Zalman Usiskin was the keynote on Thursday and discussed the integration of statistics into the whole school curriculum. I was fascinated by his epic effort to count all the numbers in a newspaper. There were 13,518 in the 64 pages of the main six sections of the newspaper. This included interesting problems of definition, which would provide some good discussion in a class activity. And finally, Friday’s keynote address was by Ronald Wasserstein, of the American Statistical Association. He teased us with the promise of a new website to be launched in August, thisisstatistics.org. This site is designed to give students a better idea of the prospects of a career in statistics. He also stressed that the most important skills for statisticians are not technical. We need to be able to communicate, collaborate, plan our career and develop leadership capacity.

In the parallel sessions there were two main themes for me, probably because I chose sessions on these topics! I learned about how younger students learn and understand probability, and about the use of simulations and bootstrapping in teaching inference.

Highlights

  • The food, the excursion, the people – all excellent and memorable.
  • I was very excited that the paper I found most inspiring was also chosen for a prize. Christoph Till did some interesting and rigorous experimentation to see how younger students understand ideas of uncertainty and risk, and whether an intervention could improve that understanding. You can see the paper here: http://icots.net/9/proceedings/pdfs/ICOTS9_8I3_TILL.pdf One idea that appealed to me was getting students to come up with their own scenarios around risk and probability.
  • The work by the people at Ludwigsburg is innovative and important and looks like fun.
  • I found out more about AP statistics, and was enticed by the idea of being an AP reader.
  • Tea and toast: Statisticians seem to be obsessed with dropping toast and seeing if the butter side goes down, and with a lady who thinks she can detect whether the milk was added before or after the tea. Many of the risk examples involve screening for different forms of cancer, and it would be nice if we could move to other scenarios as well, such as lie detectors and recruitment strategies.
  • The Island is still available for use as a teaching tool at RMIT, and we may be able to work with researchers to explore ways in which the virtual world can be used to teach different concepts. We have access to willing subjects, and they have the tools to assess and develop understanding.
  • I met lots of great people, including many who read this blog, and fellow tweeters. Hi! There is a wonderful atmosphere of cooperation at an ICOTS Conference.
  • One thing I was dying to find out, was where the next one will be held. ICOTS10 in 2018 is to be held in Kyoto, Japan. For once I won’t be too far from my time zone. I hope I see many of you there!

Big thanks to all the team, led by Roxy Peck and Roy St Laurent.

Roy, Dr Nic and Roxy at ICOTS9

It is so random! Or is it? The meaning of randomness

The concept of “random” is a tough one.

First there is the problem of lexical ambiguity. There are colloquial meanings for random that don’t totally tie in with the technical or domain-specific meanings for random.

Then there is the fact that people can’t actually be random.

Then there is the problem of equal chance vs displaying a long-term distribution.

And there is the problem that there are several conflicting ideas associated with the word “random”.

In this post I will look at these issues, and ask some questions about how we can better teach students about randomness and random sampling. This problem exists for many domain-specific terms that have colloquial meanings which hinder comprehension of the idea in question. You can read about more of these words, and some teaching ideas, in the post Teaching Statistical Language.

Lexical ambiguity

First there is lexical ambiguity. Lexical ambiguity is a special term meaning that the word has more than one meaning. Kaplan, Rogness and Fisher write about this in their 2014 paper “Exploiting Lexical Ambiguity to help students understand the meaning of Random.” I recently studied this paper closely in order to present the ideas and findings to a group of high school teachers. I found the concept of leveraging lexical ambiguity very interesting. As a useful intervention, Kaplan et al introduced a picture of “random zebras” to represent the colloquial meaning of random, and a picture of a hat to represent the idea of taking a random sample. I think it is a great idea to have pictures representing the different meanings, and it might be good to get students to come up with their own.

Representations of the different meanings of the word, random.

So what are the different meanings for random? I consulted some on-line dictionaries.

Different meanings

Without method

The first meaning of random describes something happening without pattern, method or conscious decision. An example is “random violence”.
Example: She dressed in a rather random fashion, putting on whatever she laid her hand on in the dark.

Statistical meaning

Most on-line dictionaries also give a statistical definition, which includes that each item has an equal probability of being chosen.
Example: The students’ names were taken at random from a pile, to decide who would represent the school at the meeting.

Informal or colloquial

One meaning: Something random is either unknown, unidentified, or out of place.
Example: My father brought home some random strangers he found under a bridge.

Another colloquial meaning for random is odd and unpredictable in an amusing way.
Example: My social life is so random!

People cannot be random

There has been considerable research into why people cannot provide a sequence of random numbers that is like a truly randomly generated sequence. In our minds we like things to be shared out evenly, so the sequences people produce generally have fewer and shorter runs of the same number than a truly random sequence would.
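
A quick simulation makes this vivid. The Python sketch below finds the longest run of identical outcomes in sequences of 50 genuine coin flips; such runs are typically longer than the ones people include when they write down a “random” sequence themselves:

```python
import random

random.seed(3)

def longest_run(flips):
    """Length of the longest run of identical consecutive outcomes."""
    best = current = 1
    for prev, nxt in zip(flips, flips[1:]):
        current = current + 1 if nxt == prev else 1
        best = max(best, current)
    return best

# Longest run in each of 1,000 sequences of 50 simulated coin flips.
runs = [longest_run([random.choice("HT") for _ in range(50)]) for _ in range(1_000)]
print(sum(runs) / len(runs))   # typically around 5 to 6
```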

Animals aren’t very random either, it seems. Yesterday I saw a whole lot of sheep in a paddock, and while they weren’t exactly lined up, there was a pretty similar distance between all the sheep.

Equal chance vs long-term distribution

In the paper quoted earlier, Kaplan et al used the following definition of random:

“We call a phenomenon random if individual outcomes are uncertain, but there is nonetheless a regular distribution of outcomes in a large number of repetitions.” From Moore (2007) The Basic Practice of Statistics.

Now to me, that does not insist that each outcome be equally likely, which matches my idea of randomness. In my mind, random implies chance, but not equal likelihood. When creating simulation models we would generate random variates following all sorts of distributions. The outcomes would be far from even, but in the long run they would display a distribution similar to the one being modelled.
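
A small example of what I mean: exponential waiting times (with a made-up mean of 5 minutes) are certainly random, but the outcomes are far from equally likely, and in the long run they display the exponential shape. This Python sketch is for illustration only:

```python
import random
from collections import Counter

random.seed(11)

# Random does not have to mean equally likely: exponentially distributed
# waiting times, with a made-up mean of 5 minutes.
waits = [random.expovariate(1 / 5) for _ in range(10_000)]

binned = Counter(int(w // 5) * 5 for w in waits)   # 5-minute bins
for lower in sorted(binned)[:6]:
    print(f"{lower:2d}-{lower + 5:2d} min: {binned[lower]}")   # counts fall away steadily
```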

Yet the dictionaries, and the later parts of the Kaplan paper insist that randomness requires equal opportunity to be chosen. What’s a person to do?

I propose that the meaning of the adjective, “random” may depend on the noun that it is qualifying. There are random samples and random variables. There is also randomisation and randomness.

A random sample is a sample in which each object has an equal opportunity of being chosen, and each choice of object is by chance, and independent of the previous objects chosen. A random variable is one that can take a number of values, and will generally display a pattern of outcomes similar to a given distribution.

I wonder if the problem is that randomness is somehow equated with fairness. Our most familiar examples of true randomness come from gambling, with dice, cards, roulette wheels and lotto balls. In each case there is the requirement that each outcome be equally likely.

Bearing in mind the overwhelming evidence that the “statistical meaning” of randomness includes equality, I begin to think that it might not really matter if people equate randomness with equal opportunity.

However, if you think about medical or hazard risk, the story changes. Apart from known risk-increasing factors associated with lifestyle, whether a person succumbs to a disease appears to be random. But the likelihood of succumbing is not equal to the likelihood of not succumbing. Similarly there is a clear random element in whether a future child has a disability known to be caused by an autosomal recessive gene. It is definitely random, in that there is an element of chance, and the effects on successive children are independent. But the probability of a disability is one in four. I suppose if you look at the outcomes as being which children are affected, there is an equal chance for each child.

But then think about a “lucky dip” containing many cheap prizes and a few expensive prizes. The choice of prize is random, but there is not an even chance of getting a cheap prize or an expensive prize.

I think I have mused enough. I’m interested to know what the readers think. Whatever the conclusion is, it is clear that we need to spend some time making clear to the students what is meant by randomness, and a random sample.


Introducing Probability

I have a guilty secret. I really love probability problems. I am so happy to be making videos about probability just now, and conditional probability and distributions and all that fun stuff. I am a little disappointed that we won’t be doing decision trees with Bayesian review, calculating EVPI. That is such fun, but I gave up teaching that some years ago.

The reason probability is fun is because it is really mathematics, and puzzles and logic. I love permutations and combinations too – there is something cool about working out how many ways something can happen.

So why should I feel guilty? Well, in all honesty I have to admit that there is very little need for most of that in a course about statistics at high-school or entry level university. When I taught statistical methods for management, we did some probability, but only from an applied viewpoint, and we never touched intersection and union signs or anything like that. We applied some distributions, but without much theoretical underpinning.

The GAISE (Guidelines for Assessment and Instruction in Statistics Education) Report says, “Teachers and students must understand that statistics and probability are not the same. Statistics uses probability, much as physics uses calculus.”

The question is, why do we teach probability – apart from the fact that it’s fun and makes a nice change from writing reports on time series and bivariate analysis, inference and experiments. The GAISE report also says, “Probability is an important part of any mathematical education. It is a part of mathematics that enriches the subject as a whole by its interactions with other uses of mathematics. Probability is an essential tool in applied mathematics and mathematical modeling. It is also an essential tool in statistics.”

The concept of probability is as important as it is misunderstood. It is vital to have an understanding of the nature of chance and variation in life, in order to be a well-informed, (or “efficient”) citizen. One area in which this is extremely important is in understanding risk and relative risk. When a person is told that their chances of dying of some rare disease have just doubled, it is important that they know that it may be because they have gone from one chance in a million to two chances in a million. Sure it has doubled, but it still is pretty trivial. An understanding of probability is also important in terms of gambling and resistance to the allures of games of chance. And more socially acceptable gambling, such as stockmarket trading, also requires an understanding of chance and variation.

The concept of probability is important, and a few rules of probability may help with understanding, but I suspect the mathematicians get carried away and create problems that are unlikely (probability close to zero) to ever occur in reality. Anything requiring a three-way Venn diagram has moved from applied problem to logic puzzle. This is in stark contrast to the very applied, data-driven approach used in teaching statistics in New Zealand.

Teaching Probability

The traditional approach to teaching probability is to start with the coin and the dice and the balls in the urns. As well as being mind-bogglingly boring and pointless, this also projects an artificial certainty about the probabilities, which is confusing when we start discussing models. If you look at the Khan Academy videos (but don’t) you will find trivial examples about coloured balls or sweets or strangely complex problems involving hitting a circular target. The traditional approach is also to teach probability as truth. “The probability of getting a boy is one-half”. What does that even mean?

I am currently reading the new Springer volume, Probabilistic Thinking, and intend to write a review and post it on this blog, if I can get through enough before my review copy expires. It is inspiring and surprisingly gripping (but I don’t think that is enough of a review to earn me a hard copy to keep.). There are many great ideas for teaching in it, that I hope to pass on in due time.

The New Zealand approach to teaching probability comes from a modelling perspective, right from the start. At level 1, the first two years of schooling, children are exploring chance situations, playing games with a chance element and describing possible outcomes. By years 5 and 6 they are assigning numeric values to the likelihood of an occurrence. They (in the curriculum) are being introduced to model estimates and experimental estimates of probability. Bearing in mind how difficult high school maths teachers are finding the new approach, I don’t have a lot of confidence that the primary teachers are equipped yet to make the philosophical changes, let alone enact them in the classroom.
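
A drawing-pin example shows the contrast between a model estimate and an experimental estimate of probability. In the Python sketch below, the 0.6 “true” probability driving the simulation is invented; in a classroom you would of course use real drawing pins rather than simulated ones:

```python
import random

random.seed(2024)

# Model estimate: assuming a drawing pin is equally likely to land point up
# or point down gives 0.5 -- but that assumption is dubious for a drawing pin.
model_estimate = 0.5

# Experimental estimate: the proportion of "point up" results in 200 trials.
# The 0.6 below is a made-up value used only to drive the simulation.
trials = [random.random() < 0.6 for _ in range(200)]
experimental_estimate = sum(trials) / len(trials)

print(model_estimate, experimental_estimate)
```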

We are developing a whole series of videos, teaching probability from a modelling perspective. I am particularly pleased with the second one, which introduces model estimates of probability with an example with clear and logical assumptions, rather than the contrived “We assume the coin is fair”. I am hoping that with these videos we can help students and teachers embrace a more model-based approach – and no one will ever say “The weights of the lemons follow a normal distribution.” I also hope I can do this and still leave the fun in there.


Support Dr Nic and Statistics Learning Centre videos

This is a short post, sometimes called e-begging!
I had been toying with the idea of a Kickstarter project, as a way for supporters of my work to help us keep going. Kickstarter is a form of crowd-sourcing, which lets a whole lot of people each contribute a little bit to get a project off the ground.

But we don’t really have one big project; rather, we have a stream of videos and web posts to support the teaching and learning of statistics. Patreon provides a more incremental way for appreciative fans to support the work of content creators.

You can see a video about it here:

And here is a link to the Patreon page: Link to Patreon

Rather than producing for one big publishing company, who then hold the rights to our material, we would love to keep making our content freely available to all. You can help, with just a few dollars per video.

A helpful structure for analysing graphs

Mathematicians teaching English

“I became a maths teacher so I wouldn’t have to mark essays”
“I’m having trouble getting the students to write down their own ideas”
“When I give them templates I feel as if it’s spoon-feeding them”

These are comments I hear as I visit mathematics teachers who are teaching the new statistics curriculum in New Zealand. They have a point. It is difficult for a mathematics teacher to teach in a different style. But – it can also be rewarding and interesting, and you never get asked, “Where is this useful?”

The statistical enquiry cycle shown in this video provides a structure for all statistical investigations and learning.

We start with a problem or question, and undergo an investigation, either using extant data, an experiment or observational study to answer the question. Writing skills are key in several stages of the cycle. We need to be able to write an investigative question (or hypotheses). We need to write down a plan, and sometimes an entire questionnaire. We need to write down what we find in the analysis and we need to write a conclusion to answer the original question. That’s a whole heap of writing!

And for teachers who may not be all that happy about writing themselves, and students who chose mathematical subjects to avoid writing, it can be a bridge too far.
In previous posts on teaching report writing I promote the use of templates, and give some teaching suggestions.

In this post I am concentrating on analysing graphs, using a handy acronym, OSEM. OSEM was developed by Jeremy Brocklehurst from Lincoln High School near Christchurch, NZ. There are other acronyms that would work just as well, but we like this one, not least for its link with kiwi culture. We think it is awesome (OSEM). You could Google “o for awesome” to get the background. OSEM stands for Obvious, Specific, Evidence and Meaning. It is a process to follow, rather than a checklist.

The following video takes you a step at a time through analysing a dotplot/boxplot output from iNZight (or R). Through the example, students see how to apply OSEM when examining position, spread, shape and special features of a graph. This helps them to be thorough in their analysis. For the example we use real data. Often the examples in textbooks are too neat, and when students are confronted with the messiness of reality, they don’t know what to say.
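
For the Evidence step it helps if students can quote actual numbers rather than vague impressions. The video uses iNZight, but any tool will do; here is a Python sketch with made-up reaction-time data showing the sort of quantities students might quote for position and spread:

```python
import statistics

# Made-up reaction-time data (seconds) for two groups -- the kind of figures
# students might read off a dotplot/boxplot when giving Evidence in OSEM.
group_a = [1.2, 1.4, 1.5, 1.5, 1.6, 1.8, 1.9, 2.1, 2.3, 2.4]
group_b = [1.6, 1.8, 1.9, 2.0, 2.2, 2.3, 2.5, 2.6, 2.8, 3.1]

for name, data in [("A", group_a), ("B", group_b)]:
    q1, _, q3 = statistics.quantiles(data, n=4)
    print(f"Group {name}: median = {statistics.median(data)}, "
          f"IQR = {q3 - q1:.2f}, range = {min(data)} to {max(data)}")
```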

I like the use of O for obvious. I think students can be scared to say what they think might be too obvious, and look for tricky things instead. By including “obvious” in the process, it allows them to write about the important, and usually obvious, features of a graph. I also like the emphasis on meaning. Unless the analysis of the data links back to the context and purpose of the investigation, it is merely a mathematical exercise.

Is this spoon-feeding? Far from it. We are giving students a structure that will help them to analyse any graph, including time series, scatter plots, and histograms, as well as boxplots and dotplots. It emphasises the use of quantitative information, linked with context. There is nothing revolutionary about it, but I think many statistics teachers may find it helpful as a way to break down and demystify the commenting process.

Class use of OSEM

In a class setting, OSEM is a helpful framework for students to work in groups. Students individually (perhaps on personal whiteboards) write down something obvious about the graph. Then they share answers in pairs, and decide which one to carry on with. In the pair they specify and give evidence for their “obvious” statement. Then the pairs form groups of four, and they come up with statements of meaning, that are then shared with the class as a whole.

Spoon feeding has its place

On a side-note – spoon-feeding is a really good way to make sure children get necessary nutrition until they learn to feed themselves. It is preferable to letting them starve before they get the chance to develop sufficient skills and co-ordination to get the food to their mouths independently.

Teaching Confidence Intervals

If you want your students to understand just two things about confidence intervals, what would they be?

What and what order

When making up a teaching plan for anything it is important to think about whom you are teaching, what it is you want them to learn, and what order will best achieve the most important desired outcomes. In my previous life as a university professor I mostly taught confidence intervals to business students, including MBAs. Currently I produce materials to help teach high school students. When teaching business students, I was aware that many of them had poor mathematics skills, and I did not wish that to get in the way of their understanding. High School students may well be more at home with formulas and calculations, but their understanding of the outside world is limited. Consequently the approaches for these two different students may differ.

Begin with the end in mind

I use the “all of the people, some of the time” principle when deciding on the approach to use in teaching a topic. Some of the students will understand most of the material, but most of the students will only really understand some of the material, at least the first time around. Statistics takes several attempts before you approach fluency. Generally the material students learn will be the material they get taught first, before they start to get lost. Therefore it is good to start with the important material. I wrote a post about this, suggesting starting at the very beginning is not always the best way to go. This is counter-intuitive to mathematics teachers who are often very logical and wish to take the students through from the beginning to the end.

At the start I asked this question – if you want your students to understand just two things about confidence intervals, what would they be?

To me the most important things to learn about confidence intervals are what they are and why they are needed. Learning about the formula is a long way down the list, especially in these days of computers.

The traditional approach to teaching confidence intervals

A traditional approach to teaching confidence intervals is to start with the concept of a sampling distribution, followed by calculating the confidence interval of a mean using the Z distribution. Then the t distribution is introduced. Many of the questions involve calculation by formula. Very little time is spent on what a confidence interval is and why we need them. This is the order used in many textbooks. The Khan Academy video that I reviewed in a previous post does just this.

A different approach to teaching confidence intervals

My approach is as follows:
Start with the idea of a sample and a population, and that we are using a sample to try to find out an unknown value from the population. Show our video about understanding a confidence interval. One comment on this video decried the lack of formulas. I’m not sure what formulas would satisfy the viewer, but as I was explaining what a confidence interval is, not how to get it, I had decided that formulas would not help.

The new New Zealand school curriculum follows a process to get to the use of formal confidence intervals. Previously the assessment was such that a student could pass the confidence interval section by putting values into formulas in a calculator. In the new approach, early high school students are given real data to play with, and are encouraged to suggest conclusions they might be able to draw about the population, based on the sample. Then in Year 12 they start to draw informal confidence intervals, based on the sample. This uses a simple formula for the confidence interval of a median and is shown in the following video:
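
For readers who want to see the arithmetic behind the video, here is a rough Python sketch of an informal interval of the form median ± 1.5 × IQR ÷ √n, with made-up data. (Check the exact formula your own course or assessment uses; this is for illustration only.)

```python
import math
import statistics

# Informal confidence interval for a population median:
# median +/- 1.5 * IQR / sqrt(n). Data below are made up.
data = [12, 15, 15, 16, 18, 19, 21, 22, 22, 24, 27, 30]

median = statistics.median(data)
q1, _, q3 = statistics.quantiles(data, n=4)
half_width = 1.5 * (q3 - q1) / math.sqrt(len(data))

print(f"median = {median}, informal CI = "
      f"({median - half_width:.1f}, {median + half_width:.1f})")
```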

Then in Year 13, we introduce bootstrapping as an intuitively appealing way to calculate confidence intervals. Students use existing data to draw a conclusion about two medians. This video goes through how this works and how to use iNZight to perform the calculations.
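
Bootstrapping is also easy to demonstrate in a few lines of code, for teachers who want to peek behind what iNZight is doing. A minimal Python sketch, with made-up data for two groups:

```python
import random
import statistics

random.seed(5)

# Bootstrap confidence interval for the difference between two medians.
# The data are made up; iNZight produces the equivalent at the click of a button.
group_a = [12, 15, 15, 16, 18, 19, 21, 22, 22, 24, 27, 30]
group_b = [14, 17, 18, 20, 21, 23, 24, 26, 28, 29, 31, 35]

diffs = []
for _ in range(10_000):
    resample_a = random.choices(group_a, k=len(group_a))  # resample with replacement
    resample_b = random.choices(group_b, k=len(group_b))
    diffs.append(statistics.median(resample_b) - statistics.median(resample_a))

diffs.sort()
lower, upper = diffs[int(0.025 * len(diffs))], diffs[int(0.975 * len(diffs))]
print(f"95% bootstrap CI for the difference in medians: ({lower}, {upper})")
```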

In a more traditional course, you could instead use the normal-based formula for the confidence interval of a mean. We now have a video for that as well.
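
And for completeness, here is a sketch of the normal-based (t) interval for a mean, again with made-up data; the 2.201 is the 95% t critical value for 11 degrees of freedom:

```python
import math
import statistics

# Normal-based (t) confidence interval for a mean, with made-up data.
data = [12, 15, 15, 16, 18, 19, 21, 22, 22, 24, 27, 30]

n = len(data)
mean = statistics.mean(data)
se = statistics.stdev(data) / math.sqrt(n)   # standard error of the mean
t_star = 2.201   # 95% critical value of the t distribution with n - 1 = 11 df

print(f"95% CI for the mean: ({mean - t_star * se:.1f}, {mean + t_star * se:.1f})")
```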

You could then examine the idea of the sampling distribution and the central limit theorem.

The point is that you start with getting an idea of what a confidence interval is, and then you find out how to find one, and then you start to find out the theory underpinning it. You can think of it as successive refinement. Sometimes when we see photos downloading onto a device, they start off blurry, and then gradually become clearer as we gain more information. This is a way to learn a complex idea, such as confidence intervals. We start with the big picture, and not much detail, and then gradually fill out the details of the how and how come of the calculations.

When do we teach the formulas?

Some teachers believe that the students need to know the formulas in order to understand what is going on. This is probably true for some students, but not all. There are many kinds of understanding, and I prefer conceptual and graphical approaches. If formulas are introduced at the end of the topic, then the students who like formulas are satisfied, and the others are not alienated. Sometimes it is best to leave the vegetables until last! (This is not a comment on the students!)

For more ideas about teaching confidence intervals see other posts:
Good, bad and wrong videos about confidence intervals
Confidence Intervals: informal, traditional, bootstrap
Why teach resampling