Sampling error and non-sampling error

The subject of statistics is rife with misleading terms. I have written about this before in such posts as Teaching Statistical Language and It is so random. But the terms sampling error and non-sampling error win the Dr Nic prize for counter-intuitivity and confusion generation.

Confusion abounds

To start with, the word error implies that a mistake has been made, so the term sampling error makes it sound as if we made a mistake while sampling. Well, this is wrong. And the term non-sampling error (why is this even a term?) sounds as if it is the error we make from not sampling. And that is wrong too. However, these terms are used extensively in the NZ statistics curriculum, so it is important that we clarify what they are about.

Fortunately the Glossary has some excellent explanations:

Sampling Error

“Sampling error is the error that arises in a data collection process as a result of taking a sample from a population rather than using the whole population.

Sampling error is one of two reasons for the difference between an estimate of a population parameter and the true, but unknown, value of the population parameter. The other reason is non-sampling error. Even if a sampling process has no non-sampling errors then estimates from different random samples (of the same size) will vary from sample to sample, and each estimate is likely to be different from the true value of the population parameter.

The sampling error for a given sample is unknown but when the sampling is random, for some estimates (for example, sample mean, sample proportion) theoretical methods may be used to measure the extent of the variation caused by sampling error.”
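To see what the Glossary means by estimates varying from sample to sample, here is a minimal Python sketch. It is my own illustration, not part of the Glossary: the population of 10,000 weights, the sample size of 50 and the function name `simulate_sample_means` are all made up for the purpose. It draws many random samples from the same population and compares the spread of the sample means with the usual theoretical value, the population standard deviation divided by the square root of the sample size.

```python
import random
import statistics


def simulate_sample_means(population, sample_size, n_samples, seed=0):
    """Draw repeated simple random samples and return the mean of each."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical population: 10,000 weights in kg, invented for illustration.
rng = random.Random(42)
population = [rng.gauss(55, 8) for _ in range(10_000)]
true_mean = statistics.fmean(population)

sample_means = simulate_sample_means(population, sample_size=50, n_samples=2_000)

# Spread of the estimates actually observed across the samples ...
observed_spread = statistics.pstdev(sample_means)
# ... compared with the textbook standard error, sigma / sqrt(n).
theoretical_se = statistics.pstdev(population) / 50 ** 0.5

print(f"population mean: {true_mean:.2f}")
print(f"first five sample means: {[round(m, 2) for m in sample_means[:5]]}")
print(f"observed spread: {observed_spread:.2f}, sigma/sqrt(n): {theoretical_se:.2f}")
```

No mistake is made anywhere in this process, yet almost every sample mean differs a little from the population mean. That difference is all that "sampling error" refers to.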

Non-sampling Error

“Non-sampling error is the error that arises in a data collection process as a result of factors other than taking a sample.

Non-sampling errors have the potential to cause bias in polls, surveys or samples.

There are many different types of non-sampling errors and the names used to describe them are not consistent. Examples of non-sampling errors are generally more useful than using names to describe them.”

And it proceeds to give some helpful examples.

These are great definitions, and I thought about turning them into a diagram, so here it is:

Table summarising types of error.


And there are now two videos to go with the diagram, to help explain sampling error and non-sampling error. Here is a link to the first:

Video about sampling error

One of my earliest posts, Sampling Error Isn’t, introduced the idea of using variation due to sampling and other variation as a way to make sense of these ideas. The sampling video above is based on this approach.

Students need lots of practice identifying potential sources of error in their own work, and in critiquing reports. In addition, I have found True/False questions surprisingly effective in practising the correct use of the terms. Whatever engages the students for a time in consciously deciding which term to use is helpful in getting them to understand and be aware of the concept. Then the odd terminology will cease to have its original confusing connotations.


16 thoughts on “Sampling error and non-sampling error”

    • Hi
      Another way of looking at it is to call it sampling variation. Say the true and unknown population mean weight of something is 55 kg. We take a sample which happens to contain items that give a mean of 52 kg. The sample may be representative and not have much non-sampling error at all, but there is sampling error.

      Or another example could be Lotto balls. In NZ there are 40 Lotto balls, numbered from 1 to 40, so their mean is 20.5. When 6 balls are drawn randomly, there is no non-sampling error, as this is a gambling machine that requires a high level of attention to eliminating bias and other non-sampling error. However, there is a high likelihood that any sample taken will have a mean different from 20.5. This is sampling error.

      I hope that helps
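The Lotto example can be checked with a short simulation. This is only a sketch: the number of simulated draws and the random seed are arbitrary choices of mine, and the code assumes every ball is equally likely to be drawn, which is the "no non-sampling error" condition described above.

```python
import random
from statistics import fmean

balls = list(range(1, 41))      # NZ Lotto balls numbered 1 to 40
population_mean = fmean(balls)  # 20.5, as stated in the comment

rng = random.Random(1)          # fixed seed so the run is repeatable
n_draws = 10_000                # hypothetical number of simulated draws

# Each draw is 6 balls without replacement, like the Lotto machine.
draw_means = [fmean(rng.sample(balls, 6)) for _ in range(n_draws)]

# Share of draws whose mean is not exactly the population mean.
share_different = sum(m != population_mean for m in draw_means) / n_draws

print(f"population mean: {population_mean}")
print(f"first five draw means: {[round(m, 2) for m in draw_means[:5]]}")
print(f"share of draws with a mean other than 20.5: {share_different:.1%}")
```

The draws are perfectly fair, and still nearly all of them give a mean other than 20.5. Nothing went wrong. The sample is simply not the whole population.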

    • I’m happy you like the blog.
      You can’t have examples of sampling error. Sampling error, or sampling variation, which is a better term for it, exists because you take a sample of the population. Any examples of error you make due to sampling are in fact non-sampling error.

  1. Hi, can you please let me know – if my population size is 1000 items, out of which I select 100 items and do a quality check on the 100 items, and if I discover 6 errors, is the error percentage 6% (6/100) or 0.6% (6/1000)? I felt the 100 is representative of the 1000, so the errors discovered in the 100 items are taken as having been discovered from out of the 1000 items. Can you throw some light on this please? Thanks.

    • Hi Norbert
      I think you might need to read the post again. Basically any kind of error you think of is likely to be a non-sampling error. Sampling error occurs because the sample is not the whole population.
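On the arithmetic in the question: only 100 items were inspected, so the natural estimate of the error rate is 6/100 = 6%. Dividing by 1000 would quietly assume that the 900 unchecked items contain no errors at all. The sketch below works this through in Python. The variable names are mine, and the standard error uses a common textbook approximation (the binomial formula with a finite population correction), which is fairly rough with only 6 errors observed.

```python
from math import sqrt

population_size = 1000   # items in the batch (from the question)
sample_size = 100        # items actually checked
errors_found = 6         # errors discovered in the sample

# The estimate uses only the items that were inspected.
estimated_rate = errors_found / sample_size

# Rough standard error: binomial formula with a finite population correction.
fpc = sqrt((population_size - sample_size) / (population_size - 1))
standard_error = sqrt(estimated_rate * (1 - estimated_rate) / sample_size) * fpc

# Scaling the estimate up suggests roughly 60 errors among the 1000 items.
estimated_errors_in_population = estimated_rate * population_size

print(f"estimated error rate: {estimated_rate:.1%}")
print(f"approximate standard error: {standard_error:.1%}")
print(f"estimated errors in the batch: {estimated_errors_in_population:.0f}")
```

The true error rate for the 1000 items is unlikely to be exactly 6%. The gap between the estimate and that unknown true value is the sampling error, and the standard error gives a feel for how large it is likely to be.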

