Embrace Change

I love graduations. At the University of Canterbury the academic staff act as marshals, helping the graduands to be in the right place at the right time in the right order wearing the right clothes and doing the right things. I have acted as a marshal for some years and love helping people to have a good experience. I love graduations because of the accomplishment they represent, and the efforts the students, their parents and the staff have made for these young people to complete their qualifications. This graduation was pretty special, as it was the class that had to cope with repeated earthquakes, snowfalls and other disruptions in the last half of their degrees. They are the students who had to adapt to being taught in tents or on-line, and who, at the beginning of each exam, were warned what to do in case of an aftershock and told to keep their wallet, phone and keys with them at all times. They are the students who rallied together to support each other and the community and shovel silt.

Dr Nic in PhD regalia - I love to dress up!

I also love the ceremony at graduations and dressing up in fancy clothes. I love the music and singing Gaudeamus. I cry during the National Anthem. And I love the speeches, full of hope and encouragement and advice. This graduation Emeritus Professor John Burroughs spoke and made two points. The first was to know yourself and what you can do and what you can’t. He wanted to be an All Black, but couldn’t. But he became a prominent figure in law in New Zealand. Sadly, he said he couldn’t do mathematics, an admission I wish were seen as similar to saying one couldn’t read. Why is it that people think it is okay to be bad at maths? Or even something to be proud of? Moving on…

The second point Prof Burroughs made is pertinent to the teaching of Statistics and Operations Research. He recalled the advent of the ballpoint pen when he was at school. Until then he had been at the mercy of dip or fountain pens. Then when the ballpoint pen arrived it revolutionized writing. His teachers weren’t impressed and often insisted that students stick with fountain pens so as not to ruin their penmanship. It was an example of technology and improvement and change and people’s reaction. When he was a lecturer in law at the University of Canterbury he eschewed computers and was probably called a Luddite, though not to his face. In his later career he has had to embrace the new technology, including Facebook, Twitter, Google and the like (the Like?). And he has enjoyed it. He wishes he had ridden the wave at the time.

There will always be change. Prof Burroughs’s advice to the graduates was to try to anticipate and enjoy change. Change equals opportunity.

And now I get to the point. The widespread use of powerful computers has changed Statistics and Operations Research. What will not change is change. There will continue to be advances in the accessibility of our disciplines to the masses. And we need to embrace this. When I learned Statistics and Operations Research in the early 1980s there was little computing power available. We used Eton tables, and solved two-variable LPs on Cartesian planes. We performed matrix operations and stochastic simulation by hand calculation. We learned Revised Simplex by hand. We used the Poisson approximation to the binomial distribution as that avoided tables going too high. When we used MiniSPSS we were allowed ten runs in which to produce a linear regression, and the emphasis was on the production rather than the interpretation of output.
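As a small illustration of what a computer makes trivial today, here is a sketch in Python that compares binomial probabilities with their Poisson approximation. It is not taken from any course material; the function names and the values n = 200, p = 0.01 are assumptions chosen for illustration. The point is only that students can now see for themselves when the approximation is good, rather than take it on trust from a table.

```python
from math import exp, lgamma, log, log1p


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), 0 < p < 1, computed on the log scale."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log1p(-p))


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), lam > 0, computed on the log scale."""
    return exp(k * log(lam) - lam - lgamma(k + 1))


def max_abs_difference(n, p):
    """Largest gap between the two probabilities over k = 0..n."""
    lam = n * p  # the Poisson mean that matches the binomial mean
    return max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(n + 1)
    )


if __name__ == "__main__":
    n, p = 200, 0.01  # illustrative values only: large n, small p
    print(" k  binomial  poisson")
    for k in range(6):
        print(f"{k:2d}  {binomial_pmf(k, n, p):.5f}   {poisson_pmf(k, n * p):.5f}")
    print("largest gap:", max_abs_difference(n, p))
```

Keeping the mean at 2 but changing to n = 20 and p = 0.1 makes the largest gap noticeably bigger, which is the kind of exploration the printed tables never allowed.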

That was then, and this is now, and I think too many teachers of Statistics and Operations Research have not moved on. There is certainly evidence of this in the textbooks. Recently a colleague and I reviewed all first-year Operations Research textbooks, examining their treatment of Linear Programming. One of the textbooks was a later edition of one I had used in 1981. The later edition used the same example to teach LP. Much of what was in these textbooks did not recognize the powerful opportunity the spreadsheet provides to explore and understand models.

I have also been reviewing statistics textbooks, though there are too many to be exhaustive. Statistics textbooks too often are stuck in the days of the fountain pen, rather than embracing the great possibilities that are there with the power of the computer.

I challenge all teachers of Operations Research and Statistics to examine what they do and ask if it is the same way that they were taught. If the answer is yes, then some more thinking is called for. We have such amazing opportunities to teach so much better, to use real data, to make a real difference, that to be stuck in the old methods, using tables and formulas, is close to a crime.

Stop faking it! Data should be real.

Use real data when teaching statistics

In statistical analysis the context of the data is integral, not a story added on afterwards to make it more interesting. It is not like algebra where “making it real” means you make up a reason for the equation, and require the students to give the correct units for the answer. In statistics the analysis involves understanding what is happening in the data.

For this reason, as much as possible, data must be real.

In a previous incarnation I have been guilty of making up data. I was even quite proud of being able to make sure my fake multivariate data displayed heteroscedasticity and multicollinearity. That was fine for an assessment item, I reasoned at the time, as I wanted to make sure that students could recognise those effects.

I recently reviewed a case which had been submitted for publication. The case story was great, with some interesting soft aspects, based on a real-life scenario. Then the second part of the case involved analysing data, which was openly fake. I decided to see how I would go, downloaded the data and started playing around in it. I found it disturbing that there was an R-squared value of more than 99%. Then the more I explored, the worse it got, and the more convinced I was that the problem lay in the generation of the data. This would have caused perplexity for students who really wanted to understand what was going on. It is not acceptable to have badly faked data in a case.

What is so great about real data?

With appropriate topics, the outcome interests the students. It can cause them to think, and realise that there is a use for statistics. It can be exciting! You can have discussions about why this result might have happened.

An interesting bonus, that you can choose to use or not, is that the data is dirty! (See my post about dirty data). Students learn that data does not arrive beautifully sanitised like the pristine textbook sets. They meet with the problems of real data, so they are better prepared for real data in the real world.

The failings of fake data

  1. Effects may seem really interesting, but they were put there by the instructor (sometimes by mistake) so there is no basis in reality. I see this as rather the equivalent of the movie “The Truman Show”, where a whole world is generated for Truman Burbank with exactly the events needed to make a television series interesting. Sure you may find a relationship in the data, but only because you put it there in the first place!
  2. You can get odd artefacts of the generation process. Some interesting pattern shows up when a student looks at the data a different way from what you expect. This pattern could be just because you didn’t think to get rid of it.
  3. Generating good fake data is actually quite tricky to do if you want to get it right.
  4. Using fake data trivialises the statistical process to mechanistic algorithm application. Fake data may be better than numeric data with no context, but not by much.
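To make the second and third failings concrete, here is a hedged sketch in Python of how a carelessly chosen noise level produces an implausibly tidy data set. It does not reproduce the reviewed case, whose generating process is unknown; the generator, its parameters and the function names are all hypothetical. It simply shows one way an R-squared above 99% can come straight out of the recipe rather than out of the world.

```python
import random


def make_fake_data(n, slope, intercept, noise_sd, seed):
    """Hypothetical generator: a straight line plus normal noise."""
    rng = random.Random(seed)  # seeded so the example is repeatable
    xs = [rng.uniform(0, 100) for _ in range(n)]
    ys = [intercept + slope * x + rng.gauss(0, noise_sd) for x in xs]
    return xs, ys


def r_squared(xs, ys):
    """Squared Pearson correlation, which equals R-squared for a
    simple (one-predictor) least-squares regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)


if __name__ == "__main__":
    # Too little noise: the "effect" is almost perfectly visible.
    xs, ys = make_fake_data(n=200, slope=2.0, intercept=10.0, noise_sd=1.0, seed=1)
    print("tidy fake data, R-squared:", round(r_squared(xs, ys), 4))

    # Much more noise: closer to the scatter real data tends to show.
    xs, ys = make_fake_data(n=200, slope=2.0, intercept=10.0, noise_sd=60.0, seed=1)
    print("noisier fake data, R-squared:", round(r_squared(xs, ys), 4))
```

Even the noisier version is only a straight line with symmetric noise. It has none of the outliers, gaps or odd subgroups that make real data worth discussing.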

Sources of real data

The internet abounds with data. We can just about drown in it. This is one source of data, but it is mostly clean, which removes one of the advantages of real data.

However, I prefer to get the data from the students themselves. Each year I have a questionnaire which the students fill out anonymously on-line at the start of the course. Then I use this as a source of data for class examples, exercises and testing. Over the years I have found some interesting effects among the data from our students. An important thing to remember is to make sure you have a range of levels of data. It is very easy to collect nominal/categorical data, but it’s not much use for teaching regression. Paired difference of two means can also be difficult, so you have to think ahead on that one. Here are some example questions for each level of measurement.

Nominal

  • What type of chocolate do you prefer?
  • What kind of mobile phone do you own?
  • Sex?
  • Nationality?
  • How did you travel to university today?
  • What subject are you majoring in?

Ordinal

  • How useful do you think this course will be in your future career? (Very useful, somewhat useful, not useful)
  • How successful have you been in mathematics in the past? (Very successful, somewhat successful, not successful)
  • How often do you check Facebook? (More than once a day, about once a day, several times a week, about once a week, less often than once a week.)

Interval

  • How many pairs of trousers do you own?
  • What is the most you have ever paid for a pair of trousers?
  • What annual income do you expect to be earning in ten years’ time?
  • What do you think the average income for the class will be in ten years’ time?
  • How many children would you like to have?
  • What is the ideal age to get married?

Real data in Operations Research

Unfortunately it is more difficult to find real-life problems in OR which can be solved in the classroom. One possible approach is to start with a real-life case, and then provide a cut-down version for the students to work on. When we make up exercises for OR, we search the web to make sure that the figures used are realistic estimations of real costs.

In a lesson on Multi-criteria Decision Making we had the case of locating a landfill. This was especially pertinent as our city had recently gone through the political process to set up a new landfill. A helpful website gave ballpark figures on costs for many of the aspects. With the internet at our fingertips there is no excuse for unrealistic figures.

There is work involved in collecting real data, but if we want students to accept that statistics and operations research are relevant, it must be done.

Hey mathematics – leave the stats alone!

Mathematicians love the elegance of mathematics

Mathematicians love mathematics. They love the elegance and the purity and the abstract nature of it all. Consequently they think there is something not quite nice about the practical real life messiness of statistics. Now this is fine, so long as they keep their prejudices away from their students! I recently met a high school maths teacher who was completely vocal about her dislike for statistics. Fortunately she doesn’t teach the final year statistics course, but she can’t avoid the sections of statistics all through the curriculum at lower levels. It hurt me to hear statistics so disliked.

Elementary school teachers who dislike mathematics harm the formation of good attitudes in their pupils. They don’t like maths, and they feel uneasy doing it, and that rubs off. High school teachers are often frustrated by the attitudes with which students arrive at high school. There are moves in New Zealand to address this, through the Numeracy Project, which helps to develop skills in our Primary teachers.

What bothers me is similar. Many, if not most, of our high school teachers are pure mathematicians. Some of them allow their dislike for statistics to colour the students’ experience. Or if they don’t actively dislike statistics, they may still feel ill-at-ease, as they did not get enough background knowledge in their training. They may know the mechanisms, but have no experience of statistical analysis. I know this to be true, as I was once one of them. It is difficult to go from an exact subject like mathematics, where you find x and know when you have found it, to an art/science like statistics, where x changes depending on the context.

However I am now a born-again statistics applier. I hesitate to call myself a statistician, as I don’t use R, and I’m not exactly sure what a moment is. But I know how to do statistics in the real world. I know what you should and shouldn’t do with different data, and I know how important context is. I know that you seldom get a simple random sample, and sometimes your sample is so far from random that you blush, but soldier on anyway. I’m skeptical about Factor Analysis. And I keep learning. Every time I do a real statistical analysis I gain insights into the nature of the discipline. And I love it. Statistics is a detective game. The numbers tell a story, and it is up to us to help them reveal their secrets without so much coercion that they tell us lies to make us go away.

My wish is that pure mathematicians in high schools would accept that statistics is not mathematics and never will be. It is a mathematical science, and needs to be taught differently from mathematics.

George W. Cobb and David S. Moore wrote a paper, “Mathematics, Statistics, and Teaching”, which gives answers to questions such as “How does statistical thinking differ from mathematical thinking?” and “What is the role of mathematics in statistics?”. They emphasize that beginning statistics should be taught as statistics. A beginning statistics course should use real data and automated production of graphs and analysis.

Statistics lives in the real world

This is antithetical to a pure mathematician. “Remove the maths and the graphing – or get the computer to do it, and where is the maths?”, they cry! “Exactly!”, reply the statistics teachers.

I hope there are maths (or math) teachers reading this. You can do it – you just need to accept that statistics is NOT mathematics, and learn to see the rigour and excitement in it. Embrace the messiness! Throw off the shackles of finding the one correct answer! Statistics, well-taught, will be more use to most of your students than calculus.

Statistics Textbooks suck out all the fun.

Do the textbook writers like the students?

In 1987 George Cobb published a paper evaluating statistics textbooks. I am very grateful for it, as it alerted me to the problems with textbooks, and introduced me to the man himself, whose work I greatly admire. Cobb explains that statistics is an inherently interesting and practical subject, but that many textbooks seem to have missed that, or concealed it from the students.

The discipline of statistics is inherently fascinating, applied and important. So why do so many textbooks make it seem mechanistic and abstract? I have been examining textbooks, and wonder if the writers even like their subject matter, or the students they are supposed to be reaching.

I am particularly interested in textbooks for non-mathematicians. The majority of students of statistics are not mathematicians, and are not planning to take any more statistics than they are required to. These students don’t like mathematics. They feel uneasy about taking the course. They are required to take a statistics course as part of their business, psychology or health sciences major. They aren’t even sure why they need to take the course, and hope to get it over and done with and forget about the experience as soon as possible. A previous post talks about how to help students who are feeling negatively towards the course. A textbook for these students needs to get the tone and content right.

Tone

A friendly but authoritative tone is important. Some go too far and become corny in their chattiness. It’s nice to be friendly, but it can be a bit tiresome and the examples can be too cute. But most are just too dry – and have too many words. And far too many equations and algorithms. They seem bent on protectionism rather than empowerment.

Content

Even more important is the choice of content, and I find this fascinating. I wonder what course some textbooks are designed for. A telling chapter is regression. Regression is an important statistical technique. But what do we tell students about regression? Here is how I have recently seen it done. Provide an example of real data taken from the web. Introduce the problem, then let them wait until the end to find out where you are going. Give the mathematical way of expressing a line, using Greek letters. Derive the least squares method of line fitting. Calculate the line by hand. Interpret the slope and the intercept. Calculate the coefficient of determination by hand. Interpret it. Define the residuals, and calculate them. Calculate the F-statistic and t-statistics. Interpret them. Then finish off the story you started at the beginning of the chapter (not that anyone cares anymore).

Some of you may be wondering what is wrong with that. Good – it means I am not preaching to the choir.

Students need to see the whole picture from the beginning. If you absolutely MUST do the mathematics, put it at the end of the chapter for the keen students, but don’t do the maths in the body of the text and scare the others. Do not assume the readers know how to interpret a line. Most don’t. Start with some examples that explain the context, show the line, and explain and apply the model equation. Next work through one example thoroughly, using computer output. Explain the different values and talk about what applies to the sample, and what helps us to generalize to the population. Then provide some more examples, making sure many of them are not statistically significant, some have negative slopes, and all are solving a problem using a sufficiently large sample of real data. Then give them a template for writing up a regression, explaining the different parts. Finally, if you must, you can give them the mathematics. This may keep the instructors happy so that they will buy your book.

There are differing views on finding the mean for ordinal data.

Another telling bit of content is a textbook’s approach to ordinal data. In my video about types of data two instructors argue over whether it is permissible to calculate the mean for ordinal data. It ends with them calling each other “nit-picking mathematician” and “sloppy social scientist”. My approach is to take the middle ground. It is not ideal mathematically to calculate a mean for ordinal data, but much of the time people do, so it is best to know why it may cause problems and that there is an issue, rather than pretending that it never happens. Look in the textbook. I would be wary of any text that states categorically that you cannot find the mean for ordinal data.
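One reason the mean can cause problems for ordinal data is that it depends on the numbers you happen to assign to the categories, while the median category does not. The sketch below, in Python, shows this with entirely hypothetical responses to the "How useful do you think this course will be?" question. The two groups, both codings and the function names are made up for illustration and are not data from any class.

```python
from statistics import mean, median_low

# Two order-preserving codings of the same three ordinal levels.
EQUAL_SPACING = {"not useful": 1, "somewhat useful": 2, "very useful": 3}
STRETCHED = {"not useful": 1, "somewhat useful": 2, "very useful": 10}

# Hypothetical responses from two groups of ten students.
group_a = ["somewhat useful"] * 10
group_b = ["not useful"] * 6 + ["very useful"] * 4


def mean_score(responses, coding):
    """Mean of the numeric codes; its value depends on the coding chosen."""
    return mean(coding[r] for r in responses)


def median_level(responses, coding):
    """Median category (lower middle value), reported as a label."""
    decode = {code: label for label, code in coding.items()}
    return decode[median_low(coding[r] for r in responses)]


if __name__ == "__main__":
    for name, coding in [("equal spacing", EQUAL_SPACING), ("stretched", STRETCHED)]:
        print(name)
        print("  mean A:", mean_score(group_a, coding),
              " mean B:", mean_score(group_b, coding))
        print("  median A:", median_level(group_a, coding),
              " median B:", median_level(group_b, coding))
```

Under equal spacing group A has the higher mean, and under the stretched coding group B does, although nobody's answer changed. The median category for each group is the same under both codings. That does not forbid the mean, but it is the issue students should know about when they see one reported.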

There is also the issue of the purpose of the text, both its place in the course, and in the lives of the students. Textbooks can take different roles in courses, largely as a function of the confidence and competence of the instructor. A novice instructor, unsure of the material, is well-advised to stick closely to the textbook. But an experienced and engaged instructor will find the text less and less important and more a peripheral second opinion and source of homework exercises. The internet and Wikipedia have replaced the textbook as the source of background knowledge. We suspect a textbook is used more as an expensive combination of talisman and doorstop by the students.

“Judge a book by its exercises and you cannot go far wrong,” said George Cobb. All exercises in statistics should have context. There is no place for fitting a line by hand calculation to a set of five points with no context. Leave that to mathematics courses. Statistics is about context, and all examples need to reflect that. The data should be real data, so that an interesting result is authentic, not just something dreamed up by the instructor. The data should even be dirty occasionally (though not too early in the course, and not without warning). And there should be enough data. Don’t perpetuate bad habits by using too few data.

Having said all this, I do wonder what the role of textbooks is in the education of the future. On-line materials, which can be frequently updated, and crowd-sourced explanations such as found on Wikipedia and elsewhere can fill the place of a textbook.

Or there is always our app – AtMyPace: statistics, which uses video and interactive lessons to teach some important concepts. We are now working to bring this to the web so all can use it. And then maybe I should write a textbook. ;)

Anxiety, fear and antipathy for maths, stats and OR

I love mathematical subjects. I love statistics and teaching statistics, and I love Operations Research and teaching Operations Research. But I do not represent the majority of people in the world and I definitely do not represent the majority of my students who come into my courses.

Many people don't really like mathematics.

People take my courses because they are required to. They don’t really want to do statistics and quantitative methods. However, by the end of the course, many have discovered, to their joy, that they CAN do maths, and actually enjoy it. It is empowering for them and wonderful for me. Emotional students have told me how the course has changed the way they see themselves and mathematical subjects. One young woman had previously failed two traditional statistics courses. However after passing our course, she went on to further stats courses, and eventually had a marketing internship involving data analysis, worked as a tutor on our course and completed a postgraduate degree. This is the letter she sent me after passing the course:

“I just thought I would let you know that I have really enjoyed this course, considering I hated maths this is not to be taken lightly! I was told it would be a good course for me to take but was slightly sceptical. However, I think being able to continuously see your progress and results gave me a lot of motivation and a great sense of achievement.

People enjoy succeeding.

“The tutorials were also fantastic, the tutors were always friendly and very helpful and a lot of credit must go to them. Obviously without these tutorials I would not have passed the course.

“Thank you for offering a course that has enabled me to understand and even at times enjoy stats!”

Ideas for helping students overcome antipathy towards mathematical things.

Perseverance

One difference I have noticed between people who succeed at maths and those who don’t is what they do when they get something wrong. When I do a problem and get a wrong answer, I do not see it as a personal failure, but try again. However, people who are less secure in their ability to do maths get upset at each wrong answer, and give up easily so that they can avoid further failure. They seem to take mistakes personally.

We do two things in our course to help with this. We have a large bank of problems for students to try over and over again, and tell them explicitly that we expect them to do the practice test at least seven times before they are ready for the supervised test. This way they see the failure as part of the process, rather than as a reflection of their own inadequacy. Secondly we begin our course with quite easy material and build up to more difficult. This way they start to experience success, and learn that the key to passing is putting in more time.

Relevance

Another issue is that people who dislike mathematics often do so because it feels irrelevant and a waste of time. Sometimes this is an excuse, but, tied in with the first reason, it is easy to see that people will not spend a lot of time doing something that makes them feel like a failure, and for which they cannot see the purpose. So another thing we do in our course is make sure that every single example is there because it is useful for them, and has a real world application to which they can relate, or at least which they can see is important. For example, our questions on the Binomial distribution are based on marketing, human resources, and retailing examples. Our analysis is done on data collected from the students themselves. In our Operations Research course we get them to work out a MCDM scenario related to what they will do at the end of the course. Statistics and Operations Research are inherently interesting and practical, so it should not be difficult to keep them that way.

Borrowed self-efficacy

Self-efficacy is the belief a person has in how well they can accomplish a task. Studies into effective learning have found that the level of self-efficacy a person has regarding a certain subject or learning in general is a good predictor of how well they will do. You could say that it may be quite realistic – that they believe they can do it because they can. But studies have controlled for that and the effect is still there. Now we can’t inject people with self-efficacy, but we can lend it to them. Self-efficacy can be borrowed from the instructor or the course. We tell them that this is a course for people who have previously found maths difficult. We tell them how successful other people like them have been in the course. We tell them how well it is designed and how much we are willing to help them to succeed in learning the material. Students feel this encouragement and take heart from it.

Love

People don’t care how much you know until they know how much you care. I believe you need to love the students. I’m talking here about genuine, respectful love for other human beings. We need to care about them as people, not just students in our classes. We need to love our subject and believe intrinsically in it. This shines through, even when we don’t give face-to-face lectures. One student told me he knew I was a good teacher because I was so thorough. I am thorough because I love the students and want them to succeed.

Love is not a word used often in secular higher education. However, I have been privileged to see many great teachers, whose whole approach was centered in love for the students.

These are my ways to help students who are anxious, fearful or less than keen to be taking my course. I teach them to see failure as a step to success, build the material up in small steps, make it real, help them develop self-efficacy, and let them feel how much I care. The rewards are so worth the effort!