What is the point of statistics and operations research?

What is the point of what we teach?

“But what use is this?” Through the ages, maths students have whined this to their frustrated teachers. Generally the question is a diversion tactic, to avoid work, but sometimes the question is genuine. It is helpful for teachers to have an answer worked out ahead of time. (“Be quiet and get on with your work!” isn’t really sufficient.) It is preferable to present the material in such a way that the use is so obvious as to devalue the question. And it helps if the material really is useful.

Students like to ask what use something is.

For many teachers the curriculum is a given. They teach what they teach because it is in the textbook, in the exam, in the teaching guide, in the (I’m very sad to say) standardised tests. (The evils of standardised testing are a subject for another day, and don’t really fit with this blog. See Diane Ravitch’s blog if you have any doubt.)

But I have been fortunate to be able to design my own curriculum. Designing a quantitative methods course from a green fields (or is that scorched earth?) starting point is an intriguing challenge. I didn’t have a totally free hand as we had to fit with the standards for an accounting accreditation body. But it was free enough. It made me think about what is important. What is it that I think all students with a business/commerce degree should be able to do? What attitudes would I like to inculcate? What don’t they know already, but need to? I wanted to be able to say exactly why each topic or skill was included. Interestingly the “But what use is it?” question has never come up. In five years of teaching a total of nearly two thousand students, not once has any of them questioned the worth of the material.

Our curriculum includes solving linear equations and plotting graphs, percentages, probability, data representation and, of course, inference. In each instance the practical application is taught before the abstract method. In probability, real and business world examples are used, and in inference we use data generated by the students themselves. All is taught using Excel for calculations, and we even do Pivot-tables and charts.

Students like seeing things as useful

Quantitative Methods courses are challenging because of the wide range of prior preparation. We take in students who are very good at mathematics, and have already studied some statistics. They may need to be taught the non-exact side of it. We also have students who have been out of formal schooling for some years, and may never have felt confident with numbers. There is always diversity in attitudes, diligence, confidence, competence and prior understanding and misunderstanding.

You cannot turn someone into a statistician in one entry-level statistics or quant methods course. But you can help them to become statistically literate, and critical consumers of statistical analysis. You can demystify the whole process. There are just a few big ideas in statistics, and if students grasp them, they are well on their way to statistical literacy, and possibly statistical critical consumption. This is a good goal for a first-year business statistics course.

Operations Research course

Designing your own Operations Research course is also a challenge. The type of content can depend on the context. Is it in an MBA course, an engineering school or a school of business? Again the prior preparation of the students can dictate the level of material and mode of delivery. Our Management Science course is an optional entry-level course for business students. We spent considerable time deciding what would be of most use to the majority of students who will probably never take another Operations Research course, while at the same time enticing some of the students to do so.

Role of the textbook

The easy way to design a course and select content is to follow a textbook. This is a tried and true method, which has led to uniformity and stagnation. Textbook development is like a snowball rolling down a hill, collecting more and more material and never casting off expired and out-of-date topics and approaches. Designing a course on a textbook stifles innovation and creativity.

The unfortunate off-shoot of designing an innovative course is that the fit with a textbook can be difficult. You may end up needing parts of different texts, writing your own or, as we do, not having a textbook at all. In these days of bespoke textbooks, and increasingly anxious publishing companies, it is possible to have a book made which includes parts of different texts. Universal electronic publishing will make this much more practical.

Payback

The payback for designing your own course is great. You know that each topic is of worth, and fits with the rest of the material. You know the Why of your course, as well as the What and the How. You can be convincing when you tell the students how important it is. And that makes a difference.

Significance

In statistical analysis the word “significant” means that there is evidence that the effect found in the sample exists in the population from which the sample was drawn. The choice of the word “significant” is unfortunate, as it is used to mean something different in common language. Reporters hear a scientist say that there is a significant effect, and tend to think big. Results get reported as significant, meaning big, and we have effect inflation.

Where do p-values come from?

In reality, if we take a large enough sample, even a small effect will show up as significant. Because the sample is large, it is easier to detect and be sure of the existence of small effects in the population. However, this does not mean that the effect is notable or makes any difference.
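To see how sample size alone can drive significance, here is a minimal sketch in Python. It assumes a one-sample z-test with a known standard deviation, and the effect size (0.05 standard deviations) and the two sample sizes are made-up illustrative values, not figures from any study.

```python
from math import sqrt
from statistics import NormalDist


def z_test_p_value(effect, sd, n):
    """Two-sided p-value for a one-sample z-test (sd assumed known)."""
    z = effect / (sd / sqrt(n))  # effect measured in standard errors
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical: the same tiny effect (0.05 sd) tested at two sample sizes.
for n in (100, 10_000):
    print(n, z_test_p_value(effect=0.05, sd=1.0, n=n))
```

With 100 observations the effect is nowhere near significant; with 10,000 the p-value is tiny, although the effect itself has not changed at all.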

Unfortunately this confusion is rife in the reporting of medical and educational research. A drug may have a statistically significant effect, which means that there is evidence that it exists in the population, but it may be to reduce incidence from 2 in a thousand to 1 in a thousand, which isn’t really much of a difference. To make matters worse a result like this can also be stated as a 50% decrease, which makes it seem even more miraculous.
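The arithmetic behind that example can be laid out in a few lines of Python. This is only a sketch: the function name is my own, and the "number needed to treat" figure is a standard extra summary that I have added for illustration, assuming the two incidence rates are exact.

```python
def risk_summary(risk_before, risk_after):
    """Return absolute reduction, relative reduction and number needed to treat."""
    absolute = risk_before - risk_after  # change in actual incidence
    relative = absolute / risk_before    # the headline "percentage decrease"
    needed_to_treat = 1 / absolute       # people treated per case avoided
    return absolute, relative, needed_to_treat


# Incidence falls from 2 in a thousand to 1 in a thousand.
absolute, relative, needed_to_treat = risk_summary(2 / 1000, 1 / 1000)
print(f"absolute reduction: {absolute:.1%}")
print(f"relative reduction: {relative:.0%}")
print(f"number needed to treat: {needed_to_treat:.0f}")
```

The same result is a 50% relative decrease and a 0.1 percentage point absolute decrease, which is why it matters which of the two gets reported.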

This post is more about learning statistics, but those who teach it really need to be alert for this misconception. We have just posted our latest YouTube video explaining significance and usefulness, evidence and strength, association and causation.

Within the video I have tried to give memorable images, which students will hold onto, even if they don’t quite remember the reasoning. The p-value shrinking as the Evidence label grows aims to help students understand that a small p means more evidence to reject the null. I’m also really pleased with my “p-machine”, turning the mean, standard deviation and sample size into a t statistic, which is then converted to a p-value.
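For readers who like to see the cogs, here is one way a “p-machine” could be sketched in Python. It is not the version in the video: it assumes a one-sample, two-sided t-test against a hypothesised mean (called mu0 here, my own label), and it finds the area under the t distribution by simple numerical integration so that it needs nothing beyond the standard library. The sample figures at the bottom are invented.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 minus the area between -|t| and |t|.

    Uses Simpson's rule (steps must be even); fine for teaching,
    but not accurate for extremely small p-values.
    """
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_area = total * h / 3  # area from 0 to |t|
    return max(0.0, 1 - 2 * half_area)


def p_machine(mean, sd, n, mu0):
    """Turn the sample mean, sd and size into a t statistic and a p-value."""
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, t_two_sided_p(t, df=n - 1)


# Hypothetical sample: mean 52, sd 3, n = 10, tested against mu0 = 50.
t, p = p_machine(mean=52.0, sd=3.0, n=10, mu0=50.0)
print(t, p)
```

As a sanity check, a t statistic of about 2.26 with 9 degrees of freedom should give a p-value close to 0.05, which matches the familiar table value.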

There are just a few really big ideas in statistics, and these are some of them. This forms part of statistical literacy, which is important to all citizens. I hope you may find the video useful in helping students remember.

Excel in Statistics and Operations Research

I love spreadsheets

The spreadsheet package is a wondrous thing. It has revolutionized a great many processes in the office, home and scientific research. It has affected the way we think and teach. It has enabled many more people to program and to build models, without even knowing it (and sometimes very badly). And, for better or worse, Excel has become the default spreadsheet package.

I have used Excel to create crosswords

I love spreadsheets and I love Excel. I first became acquainted with Multiplan and Lotus 1-2-3 in 1984 as part of my graduate degree. It was amazing to see how versatile these spreadsheets were. Since then I have also used and taught VisiCalc, a couple of shareware spreadsheets and finally Excel. At home I have used Excel to plan holidays, make sense of time zones, organize my decorative egg collection, convert recipes, budget, run the children’s organization at my church, make crosswords and other learning activities, plan a layout for my little sun-room, keep track of my time when juggling home, work and study commitments, cater for functions and schedule work. It is a standing joke at work that if there is a problem I will use a spreadsheet to solve it. And if that fails, try duct tape.

Videos about spreadsheets

I have even made YouTube videos about Excel spreadsheets. Our first one, Absolute and Relative References, was made for the class to help with teaching a concept we seemed to have to explain over and over. It became popular worldwide and was the springboard for other videos about Excel. The video on Linear Programming using Excel Solver uses Lego pieces to illustrate a product mix LP. At an INFORMS conference in San Diego the FrontLine representative voiced his approval of the video, and it has received consistent viewing internationally for many years. It is due for renewal, thanks to upgrades in Solver, but is still helpful for teaching the fundamentals of LP.

Excel and Statistics

There are different schools of thought about using Excel in the teaching of statistics and operations research. My view is that it is preferable to teach Excel in conjunction with another discipline than in a specific computer skills course. Excel plays a big part as a tool in the entry-level Quantitative Methods and Management Science courses taught here at the University of Canterbury.

Statistics is not mathematics, but when it is taught by mathematicians, they are often captured by the mathematical aspects of statistics. Because they understand formulas by looking at them, they think their students will too. This is fine for a course in mathematical statistics. My experience is that most students of statistics are not in the class from choice, and many find the mathematics daunting. It becomes a barrier to understanding statistics. Using a spreadsheet can help to remove that barrier.

I am aware that the Analysis ToolPak in Excel is far from ideal. It doesn’t include a test for a mean, and the data handling capabilities are poor at best. It can’t deal with missing values, the dialog boxes are unintuitive, and it isn’t even easy to add in when you want to use it. There are no box plots and the histograms are a joke at best and usually incorrect. I would much rather use a dedicated statistics package such as Minitab or SPSS. But there are two over-riding reasons why I continue to have the students learn statistics using Excel. The first reason is that they may never see Minitab or SPSS again in their lives, but Excel or something like it will be in any business they are in. The other reason is that they are learning Excel skills as well – what my colleague charmingly calls CV (resume) expansion. Time and again I have received feedback from students on how the Excel skills have come in handy. Because they use Excel all the time in the course, for all calculations, it becomes a tool, rather than a hurdle.

Graphing in Excel

Excel Graphs are ubiquitous. I wonder if the minion who decided on the default colors of maroon and beige in Excel 2003 smiled when he/she saw the myriad graphs produced in that color scheme. Certainly it was a flag to show whence they came and that the producer of the graph did not know enough to change from the default. Sadly some of the Excel graphs are awful. Tufte would not approve of all the trimming and empty space. Exploding three-dimensional pie-charts should be permanently exploded and abandoned. I realize it is not Excel’s fault that people like really bad graphs. I just wish it wasn’t so good at enabling them. A pie chart is another indicator that the analyst doesn’t really know what they are doing.

Excel makes possible the production of truly awful pie charts.

Despite all that it is fantastic that graphs are easy to produce in Excel. Graphing should be the first step in any data analysis, but manual production of graphs is tedious in the extreme. Excel makes it possible to have a good look at your data, quite quickly, and get an idea for what is happening, in terms of relationships and errors before blundering on with unnecessary analysis on dirty data. There is now no excuse not to graph. And Pivot Charts! These are SO cool. Once I got over the tricky interface and worked out how it all worked, I became a convert. My students are required to produce two-way bar charts and tables using Pivot-Tables and Pivot-Charts. I’m sure they will thank me for it one day. Some of them even do at the time!

Operations Research and Excel

In teaching Operations Research, Excel is a boon. We can now escape from trivial linear programs limited to two decision variables so that they can be plotted on the Cartesian plane. In our Management Science course we use Excel for teaching the idea of a model, for linear programming, multicriteria decision making, discrete event simulation and for financial models using discounting and annuities. There is a transformative opportunity to explore the models and see instantly the effect of a change in input value. Excel is a great introduction to Operations Research, and in an entry-level or MBA paper such as business methods, is sufficient. However any second level paper in OR should be using more appropriate software.

Teachers of decision sciences such as Statistics and Operations Research have the opportunity to teach good spread-sheeting practice, along with their own discipline. We need to be careful that we model best practice with regard to formatting, use of input cells, and avoiding numbers in formulas. This last video is a little silly, because I gave my editor a free hand. However, the four style rules are still worth promulgating. Enjoy!

Uncertainty, luck and control

Is probability mathematics or statistics?

Probability is often taught alongside statistics, and it is a rather uneasy alliance. Mathematics teachers like probability. It behaves as a good mathematical topic should, and gives nice exact answers. But it also provides the theoretical underpinnings to the rather murkier subjects of statistics, modeling and operations research. An understanding of probability is needed for simulation, queueing theory, inference, decision analysis, project management and inventory control.

Mathematics teachers should love probability as it is a subject that lends itself to hands-on activities and real-life applications. Problem is, the whole uncertain nature of probability means that you can’t guarantee that an experiment will give the results you want.

Sometimes chance activities don’t turn out as we want them to.

For example, one activity I use to teach about sampling variation is to get each member of the class to write down how many hours that week they have spent on the subject. I collect up the pieces of paper, and in front of them take two samples, each of size 5. I ask the class if they expect the means for the two samples to be the same. And they don’t, and usually they aren’t. But one time they were! Now how likely was that? That would be an interesting advanced exercise. I was able to turn it into a learning moment, but I would rather that the means had been different, to illustrate my point.
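That “advanced exercise” can be approximated by simulation. The sketch below is only illustrative: the class of 40 students and their whole-number hours are invented (ironically, since real slips of paper would be better), and the function name is my own. It repeatedly draws two non-overlapping samples of size 5 and counts how often the two means coincide.

```python
import random


def chance_of_equal_means(hours, sample_size=5, trials=20_000, seed=1):
    """Estimate the chance that two disjoint random samples have equal means."""
    rng = random.Random(seed)
    equal = 0
    for _ in range(trials):
        drawn = rng.sample(hours, 2 * sample_size)  # drawn without replacement
        first, second = drawn[:sample_size], drawn[sample_size:]
        if sum(first) == sum(second):  # equal sums mean equal means
            equal += 1
    return equal / trials


# Hypothetical class of 40 students reporting whole hours from 0 to 10.
class_rng = random.Random(0)
class_hours = [class_rng.randint(0, 10) for _ in range(40)]
print(chance_of_equal_means(class_hours))
```

The answer depends heavily on how the hours are recorded: whole numbers make ties far more likely than times reported to the minute, which is itself a useful discussion point.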

Probability is about what might happen in the future, and statistics is about what has already happened. Probability is predicting what you might draw out, from knowing what is in the bag, whereas statistics is inferring what is in the bag, from what you drew out. (I would like to attribute this great analogy but can’t remember where I found it – so if anyone can tell me, please do.)

When I teach about probability I introduce four sources of probability estimates: A priori, relative frequency, modeled and subjective. Often students are exposed only to a priori, and get stumped in areas like decision analysis or simulation where other types of probability are needed.

A priori

A priori describes the probability of coins, dice, balls and urns, of people drawing socks out of drawers in the dark and selecting chocolates with their eyes closed. A priori seems rather fantastic, when taken at that level. This is the kind of probability that maths teachers like. The probability of an event happening is the number of ways that event can happen divided by the total number of possible outcomes. You need addition and division, and the ability to turn expressions into words. Casino gambling and lotteries are based on a priori probabilities. There is exactly one chance in 37 that the ball will land on the number 7 on a European roulette wheel (one in 38 on an American wheel). It is rather nifty really.
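The counting rule for a priori probability translates directly into code. This is a small sketch using two fair dice as the example; the helper name a_priori is my own, and the rule only applies when every outcome in the list is equally likely.

```python
from fractions import Fraction
from itertools import product


def a_priori(event, outcomes):
    """Favourable outcomes divided by all (equally likely) outcomes."""
    outcomes = list(outcomes)
    favourable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favourable, len(outcomes))


# All 36 equally likely results of rolling two fair dice.
two_dice = list(product(range(1, 7), repeat=2))
p_seven = a_priori(lambda roll: sum(roll) == 7, two_dice)
print(p_seven)  # 1/6
```

Using exact fractions rather than decimals keeps the “nice exact answers” that make this kind of probability so appealing to mathematicians.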

Relative frequency

Probability based on relative frequency draws on the past. In the past 20% of people have been late to appointments, so we use 0.2 or 20% as the estimate for how often we expect people to be late in the future. This estimation of probability is only as good as the data collected previously, and the stability of the system.

Modelled

We use models to try to predict outcomes

Weather, avalanches and earthquake events are assigned probabilities using historical data and mathematical models. Forecasting is a major part of business analysis, to try to reduce the uncertainty of the future by modeling with the data from the past. Relative frequency is a very simple form of modeled probability, where the parameter from the past is used unchanged.

Subjective

When we estimate the probability of an occurrence we produce a subjective probability figure. There is a whole branch of research into how people predict, and their systematic biases. Things like the vividness of the outcome, the number of similar occurrences you can think of, and a number given you ahead of time, all affect subjective probabilities. Even language can mislead. The expression, “a fair chance”, can be interpreted as anything from 0.1 to 0.9 by different people, particularly in different cultures.

Culture also affects a person’s views on probability and likelihood. It has been shown that exposure to games of chance affects a person’s understanding of probability. For some cultures there is no probability. Things will either happen or they won’t, depending on the will of God. To attempt to predict is presumptuous.

These considerations are important when teaching about probability and models. Decision trees rely on good probability estimates, so it is vital that students understand the pitfalls of different sources. Even when we know and understand the rules of probability, we fall victim to the mistakes identified by Tversky and Kahneman.

You may be interested in my other blog, Never Ordinary, about life with an autistic savant, who REALLY doesn’t like uncertainty.

Embrace Change

I love graduations. At the University of Canterbury the academic staff act as marshals, helping the graduands to be in the right place at the right time in the right order wearing the right clothes and doing the right things. I have acted as a marshal for some years and love helping people to have a good experience. I love graduations because of the accomplishment they represent, and the efforts the student, the parents and the staff have made for these young people to complete their qualifications. This graduation was pretty special, as it was the class that had to cope with repeated earthquakes, snowfalls and other disruptions in the last half of their degrees. They are the students who had to adapt to being taught in tents or on-line, and who, at the beginning of each exam, were warned what to do in case of an aftershock and told to keep their wallet, phone and keys with them at all times. They are the students who rallied together to support each other and the community and shovel silt.

Dr Nic in PhD regalia - I love to dress up!

I also love the ceremony at graduations and dressing up in fancy clothes. I love the music and singing Gaudeamus. I cry during the National Anthem. And I love the speeches, full of hope and encouragement and advice. This graduation Emeritus Professor John Burroughs spoke and made two points. The first was to know yourself and what you can do and what you can’t. He wanted to be an All Black, but couldn’t. But he became a prominent figure in law in New Zealand. Sadly he said he couldn’t do mathematics, which I wish was an admission seen as similar to saying one couldn’t read. Why is it that people think it is okay to be bad at math? Or even something to be proud of? Moving on…

The second point Prof Burroughs made is pertinent to the teaching of Statistics and Operations Research. He recalled the advent of the ballpoint pen when he was at school. Until then he had been at the mercy of dip or fountain pens. Then when the ballpoint pen arrived it revolutionized writing. His teachers weren’t impressed and often insisted that students stick with fountain pens so as not to ruin their penmanship. It was an example of technology and improvement and change and people’s reaction. When he was a lecturer in law at the University of Canterbury he eschewed computers and was probably called a Luddite, though not to his face. In his later career he has had to embrace the new technology, including Facebook, Twitter, Google and the like (the Like?). And he has enjoyed it. He wishes he had ridden the wave at the time.

There will always be change. Prof Burroughs’s advice to the graduates was to try to anticipate and enjoy change. Change equals opportunity.

And now I get to the point. The widespread use of powerful computers has changed Statistics and Operations Research. What will not change is change. There will continue to be advances in the accessibility of our disciplines to the masses. And we need to embrace this. When I learned Statistics and Operations Research in the early 1980s there was little computing power available. We used Eton tables, and solved two-variable LPs on Cartesian planes. We performed matrix operations and stochastic simulation by hand calculation. We learned Revised Simplex by hand. We used the Poisson approximation to the binomial distribution as that avoided tables going too high. When we used MiniSPSS we were allowed ten runs in which to produce a linear regression, and the emphasis was on the production rather than the interpretation of output.

That was then, and this is now, and I think too many teachers of Statistics and Operations Research have not moved on. There is certainly evidence of this in the textbooks. Recently a colleague and I reviewed all first year Operations Research textbooks, examining their treatment of Linear Programming. One of the textbooks was a later edition of one I had used in 1981. The later edition used the same example to teach LP. Much of what was in these textbooks did not recognize the powerful opportunity the spreadsheet provides to explore and understand models.

I have also been reviewing statistics textbooks, though there are too many to be exhaustive. Statistics textbooks too often are stuck in the days of the fountain pen, rather than embracing the great possibilities that are there with the power of the computer.

I challenge all teachers of Operations Research and Statistics to examine what they do and ask if it is the same way that they were taught. If the answer is yes, then some more thinking is called for. We have such amazing opportunities to teach so much better, to use real data, to make a real difference, that to be stuck in the old methods, using tables and formulas, is close to a crime.

Stop faking it! Data should be real.

Use real data when teaching statistics

In statistical analysis the context of the data is integral, not a story added on afterwards to make it more interesting. It is not like algebra where “making it real” means you make up a reason for the equation, and require the students to give the correct units for the answer. In statistics the analysis involves understanding what is happening in the data.

For this reason, as much as possible, data must be real.

In a previous incarnation I have been guilty of making up data. I was even quite proud of being able to make sure my fake multivariate data displayed heteroscedasticity and multicollinearity. That was fine for an assessment item, I reasoned at the time, as I wanted to make sure that students could recognise those effects.

I recently reviewed a case which had been submitted for publication. The case story was great, with some interesting soft aspects, based on a real-life scenario. Then the second part of the case involved analysing data, which was openly fake. I decided to see how I would go, downloaded the data and started playing around in it. I found it disturbing that there was an R-squared value of more than 99%. Then the more I explored, the worse it got, and the more convinced I was that the problem lay in the generation of the data. This would have caused perplexity for students who really wanted to understand what was going on. It is not acceptable to have badly faked data in a case.

What is so great about real data?

With appropriate topics, the outcome interests the students. It can cause them to think, and realise that there is a use for statistics. It can be exciting! You can have discussions about why this result might have happened.

An interesting bonus, that you can choose to use or not, is that the data is dirty! (See my post about dirty data). Students learn that data does not arrive beautifully sanitised like the pristine textbook sets. They meet with the problems of real data, so they are better prepared for real data in the real world.

The failings of fake data

  1. Effects may seem really interesting, but they were put there by the instructor (sometimes by mistake) so there is no basis in reality. I see this as rather the equivalent of the movie “The Truman Show”, where a whole world is generated for Truman Burbank with exactly the events needed to make a television series interesting. Sure you may find a relationship in the data, but only because you put it there in the first place!
  2. You can get odd artefacts of the generation process. Some interesting pattern shows up when a student looks at the data a different way from what you expect. This pattern could be just because you didn’t think to get rid of it.
  3. Generating good fake data is actually quite tricky to do if you want to get it right.
  4. Using fake data trivialises the statistical process to mechanistic algorithm application. Fake data may be better than numeric data with no context, but not by much.

Sources of real data

The internet abounds with data. We can just about drown in it. This is one source of data, but it is mostly clean, which removes one of the advantages of real data.

However I prefer to get the data from the students themselves. Each year I have a questionnaire which the students fill out anonymously on-line at the start of the course. Then I use this as a source of data for use in class examples, exercises and testing. Over the years I have found some interesting effects among the data from our students. An important thing to remember is to make sure you have a range of levels of data. It is very easy to collect nominal/categorical data, but it’s not much use for teaching regression. Paired difference of two means can also be difficult, so you have to think ahead on that one. Here are some example questions for each level of measurement.

Nominal

  • What type of chocolate do you prefer?
  • What kind of mobile phone do you own?
  • Sex?
  • Nationality?
  • How did you travel to university today?
  • What subject are you majoring in?

Ordinal

  • How useful do you think this course will be in your future career? (Very useful, somewhat useful, not useful)
  • How successful have you been in mathematics in the past? (Very successful, somewhat successful, not successful)
  • How often do you check Facebook? (More than once a day, about once a day, several times a week, about once a week, less often than once a week.)

Interval

  • How many pairs of trousers do you own?
  • What is the most you have ever paid for a pair of trousers?
  • What annual income do you expect to be earning in ten years’ time?
  • What do you think the average income for the class will be in ten years’ time?
  • How many children would you like to have?
  • What is the ideal age to get married?

Real data in Operations Research

Unfortunately it is more difficult to find real-life problems in OR which can be solved in the classroom. One possible approach is to start with a real-life case, and then provide a cut-down version for the students to work on. When we make up exercises for OR, we search the web to make sure that the figures used are realistic estimations of real costs.

In a lesson on Multi-criteria Decision Making we had the case of locating a landfill. This was especially pertinent as our city had recently gone through the political process to set up a new landfill. A helpful website gave ballpark figures on costs for many of the aspects. With the internet at our fingertips there is no excuse for unrealistic figures.

There is work involved in collecting real data, but if we want students to accept that statistics and operations research are relevant, it must be done.

Hey mathematics – leave the stats alone!

Mathematicians love the elegance of mathematics

Mathematicians love mathematics. They love the elegance and the purity and the abstract nature of it all. Consequently they think there is something not quite nice about the practical real life messiness of statistics. Now this is fine, so long as they keep their prejudices away from their students! I recently met a high school maths teacher who was completely vocal about her dislike for statistics. Fortunately she doesn’t teach the final year statistics course, but she can’t avoid the sections of statistics all through the curriculum at lower levels. It hurt me to hear statistics so disliked.

Elementary school-teachers who dislike mathematics harm the good attitude formation in their pupils. They don’t like maths, and they feel uneasy doing it, and that rubs off. High school teachers are often frustrated by the attitudes with which students arrive at high school. There are moves in New Zealand to address this, through the Numeracy Project, which helps to develop skills in our Primary teachers.

What bothers me is similar. Many, if not most, of our high school teachers are pure mathematicians. Some of them allow their dislike for statistics to colour the students’ experience. Or if they don’t actively dislike statistics, they may still feel ill-at-ease, as they did not get enough background knowledge in their training. They may know the mechanisms, but have no experience of statistical analysis. I know this to be true, as I was once one of them. It is difficult to go from an exact subject like mathematics, where you find x and know when you have found it, to an art/science like statistics, where x changes depending on the context.

However I am now a born-again statistics applier. I hesitate to call myself a statistician, as I don’t use R, and I’m not exactly sure what a moment is. But I know how to do statistics in the real world. I know what you should and shouldn’t do with different data, and I know how important context is. I know that you seldom get a simple random sample, and sometimes your sample is so far from random that you blush, but soldier on anyway. I’m skeptical about Factor Analysis. And I keep learning. Every time I do a real statistical analysis I gain insights into the nature of the discipline. And I love it. Statistics is a detective game. The numbers tell a story, and it is up to us to help them reveal their secrets without so much coercion that they tell us lies to make us go away.

My wish is that pure mathematicians in high schools would accept that statistics is not mathematics and never will be. It is a mathematical science, and needs to be taught differently from mathematics.

George W. Cobb and David S. Moore wrote a paper, “Mathematics, Statistics, and Teaching”, which gives answers to questions such as “How does statistical thinking differ from mathematical thinking?” and “What is the role of mathematics in statistics?”. They emphasize that beginning statistics should be taught as statistics. A beginning statistics course should use real data and automated production of graphs and analysis.

Statistics lives in the real world

This is antithetical to a pure mathematician. “Remove the maths and the graphing – or get the computer to do it, and where is the maths?”, they cry! “Exactly!”, reply the statistics teachers.

I hope there are maths (or math) teachers reading this. You can do it – you just need to accept that statistics is NOT mathematics, and learn to see the rigour and excitement in it. Embrace the messiness! Throw off the shackles of finding the one correct answer! Statistics, well-taught, will be more use to most of your students than calculus.

Statistics Textbooks suck out all the fun.

Do the textbook writers like the students?

In 1987 George Cobb published a paper evaluating statistics textbooks. I am very grateful for it, as it alerted me to the problems with textbooks, and introduced me to the man himself, whose work I greatly admire. Cobb explains that statistics is an inherently interesting and practical subject, but that many textbooks seem to have missed that, or concealed it from the students.

The discipline of statistics is inherently fascinating, applied and important. So why do so many textbooks make it seem mechanistic and abstract? I have been examining textbooks, and wonder if the writers even like their subject matter, or the students they are supposed to be reaching.

I am particularly interested in textbooks for non-mathematicians. The majority of students of statistics are not mathematicians, and are not planning to take any more statistics than they are required to. These students don’t like mathematics. They feel uneasy about taking the course. They are required to take a statistics course as part of their business, psychology or health sciences major. They aren’t even sure why they need to take the course, and hope to get it over and done with and forget about the experience as soon as possible. A previous post talks about how to help students who are feeling negatively towards the course. A textbook for these students needs to get the tone and content right.

Tone

A friendly but authoritative tone is important. Some textbooks go too far and become corny in their chattiness. It’s nice to be friendly, but it can be a bit tiresome and the examples can be too cute. But most are just too dry – and have too many words. And far too many equations and algorithms. They seem bent on protectionism rather than empowerment.

Content

Even more important is the choice of content, and I find this fascinating. I wonder what course some textbooks are designed for. A telling chapter is regression. Regression is an important statistical technique. But what do we tell students about regression? Here is how I have recently seen it done. Provide an example of real data taken from the web. Introduce the problem, then let them wait until the end to find out where you are going. Give the mathematical way of expressing a line, using Greek letters. Derive the least squares method of line fitting. Calculate the line by hand. Interpret the slope and the intercept. Calculate the coefficient of determination by hand. Interpret it. Define the residuals, and calculate them. Calculate the F-statistic and t-statistics. Interpret them. Then finish off the story you started at the beginning of the chapter (not that anyone cares anymore).

Some of you may be wondering what is wrong with that. Good – it means I am not preaching to the choir.

Students need to see the whole picture from the beginning. If you absolutely MUST do the mathematics, put it at the end of the chapter for the keen students, but don’t do the maths in the body of the text and scare the others. Do not assume the readers know how to interpret a line. Most don’t. Start with some examples that explain the context, show the line, and explain and apply the model equation. Next work through one example thoroughly, using computer output. Explain the different values and talk about what applies to the sample, and what helps us to generalize to the population. Then provide some more examples, making sure many of them are not statistically significant, some have negative slopes, and all are solving a problem using a sufficiently large sample of real data. Then give them a template for writing up a regression, explaining the different parts. Finally, if you must, you can give them the mathematics. This may keep the instructors happy so that they will buy your book.

There are differing views on finding the mean for ordinal data.

Another telling bit of content is a textbook’s approach to ordinal data. In my video about types of data two instructors argue over whether it is permissible to calculate the mean for ordinal data. It ends with them calling each other “nit-picking mathematician” and “sloppy social scientist”. My approach is to take the middle ground. It is not ideal mathematically to calculate a mean for ordinal data, but much of the time people do, so it is best to know why it may cause problems and that there is an issue, rather than pretending that it never happens. Look in the textbook. I would be wary of any text that states categorically that you cannot find the mean for ordinal data.
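One way to show students why the mean may cause problems for ordinal data is to code the same responses in two different ways. The sketch below is an illustration under stated assumptions: the five-level scale, both numeric codings and the responses are all invented for the purpose. Both codings preserve the order of the categories, which is all that ordinal data guarantees.

```python
from statistics import mean, median_low

LEVELS = ["very poor", "poor", "neutral", "good", "very good"]

# Two order-preserving codings of the same scale (both are assumptions).
EQUAL_STEPS = dict(zip(LEVELS, [1, 2, 3, 4, 5]))
UNEQUAL_STEPS = dict(zip(LEVELS, [1, 2, 3, 6, 7]))

# Hypothetical responses, invented for illustration only.
group_a = ["neutral"] * 10
group_b = ["very poor"] * 4 + ["good"] * 6


def summarise(responses, coding):
    """Return (mean, median) of the responses under a numeric coding."""
    codes = [coding[r] for r in responses]
    # median_low always returns an actual data value, never an average.
    return mean(codes), median_low(codes)


for name, coding in [("equal steps", EQUAL_STEPS),
                     ("unequal steps", UNEQUAL_STEPS)]:
    mean_a, median_a = summarise(group_a, coding)
    mean_b, median_b = summarise(group_b, coding)
    print(f"{name}: mean A={mean_a}, mean B={mean_b}, "
          f"median A={median_a}, median B={median_b}")
```

With these made-up responses, group A has the higher mean under the equal-step coding and group B has the higher mean under the unequal-step coding, while the median puts group B higher either way. The comparison of means depends on spacing that the data do not actually tell us, which is the issue students need to know about even if they go on to calculate the mean anyway.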

There is also the issue of the purpose of the text, both its place in the course, and in the lives of the students. Textbooks can take different roles in courses, largely as a function of the confidence and competence of the instructor. A novice instructor, unsure of the material, is well-advised to stick closely to the textbook. But an experienced and engaged instructor will find the text less and less important and more a peripheral second opinion and source of homework exercises. The internet and Wikipedia have replaced the textbook as the source of background knowledge. We suspect a textbook is used more as an expensive combination of talisman and doorstop by the students.

“Judge a book by its exercises and you cannot go far wrong,” said George Cobb. All exercises in statistics should have context. There is no place for fitting a line by hand calculation to a set of five points with no context. Leave that to mathematics courses. Statistics is about context, and all examples need to reflect that. The data should be real data, so that an interesting result is authentic, not just something dreamed up by the instructor. The data should even be dirty occasionally (but not too early in the course, and not without warning). And there should be enough data. Don’t perpetuate bad habits by using too few data.

Having said all this, I do wonder what the role of textbooks is in the education of the future. On-line materials, which can be frequently updated, and crowd-sourced explanations such as found on Wikipedia and elsewhere can fill the place of a textbook.

Or there is always our app – AtMyPace: statistics, which uses video and interactive lessons to teach some important concepts. We are now working to bring this to the web so all can use it. And then maybe I should write a textbook. ;)

Anxiety, fear and antipathy for maths, stats and OR

I love mathematical subjects. I love statistics and teaching statistics, and I love Operations Research and teaching Operations Research. But I do not represent the majority of people in the world and I definitely do not represent the majority of my students who come into my courses.

Many people don't really like mathematics.

People take my courses because they are required to. They don’t really want to do statistics and quantitative methods. However, by the end of the course, many have discovered, to their joy, that they CAN do maths, and actually enjoy it. It is empowering for them and wonderful for me. Emotional students have told me how the course has changed the way they see themselves and mathematical subjects. One young woman had previously failed two traditional statistics courses. However after passing our course, she went on to further stats courses, and eventually had a marketing internship involving data analysis, worked as a tutor on our course and completed a postgraduate degree. This is the letter she sent me after passing the course:

“I just thought I would let you know that I have really enjoyed this course, considering I hated maths this is not to be taken lightly! I was told it would be a good course for me to take but was slightly sceptical. However, I think being able to continuously see your progress and results gave me a lot of motivation and a great sense of achievement.

People enjoy succeeding.

“The tutorials were also fantastic, the tutors were always friendly and very helpful and a lot of credit must go to them. Obviously without these tutorials I would not have passed the course.

“Thank you for offering a course that has enabled me to understand and even at times enjoy stats!”

Ideas for helping students overcome antipathy towards mathematical things.

Perseverance

One difference I have noticed between people who succeed at maths and those who don’t is what they do when they get something wrong. When I do a problem and get a wrong answer, I do not see it as a personal failure, but try again. However people who are less secure in their ability to do maths get upset at each wrong answer, and give up easily so that they can avoid further failure. They seem to take mistakes personally.

We do two things in our course to help with this. We have a large bank of problems for students to try over and over again, and tell them explicitly that we expect them to do the practice test at least seven times before they are ready for the supervised test. This way they see the failure as part of the process, rather than as a reflection of their own inadequacy. Secondly we begin our course with quite easy material and build up to more difficult. This way they start to experience success, and learn that the key to passing is putting in more time.

Relevance

Another issue is that people who dislike mathematics often do so because it feels irrelevant and a waste of time. Sometimes this is an excuse, but, tied in with the first reason, it is easy to see that people will not spend a lot of time doing something that makes them feel like a failure, and for which they cannot see the purpose. So another thing we do in our course is make sure that every single example is there because it is useful for them, and has a real world application to which they can relate, or at least which they can see is important. For example, our questions on the Binomial distribution are based on marketing, human resources, and retailing examples. Our analysis is done on data collected from the students themselves. In our Operations Research course we get them to work out a multicriteria decision-making (MCDM) scenario related to what they will do at the end of the course. Statistics and Operations Research are inherently interesting and practical, so it should not be difficult to keep them that way.

Borrowed self-efficacy

Self-efficacy is the belief a person has in how well they can accomplish a task. Studies into effective learning have found that the level of self-efficacy a person has regarding a certain subject or learning in general is a good predictor of how well they will do. You could say that it may be quite realistic – that they believe they can do it because they can. But studies have controlled for that and the effect is still there. Now we can’t inject people with self-efficacy, but we can lend it to them. Self-efficacy can be borrowed from the instructor or the course. We tell them that this is a course for people who have previously found maths difficult. We tell them how successful other people like them have been in the course. We tell them how well it is designed and how much we are willing to help them to succeed in learning the material. Students feel this encouragement and take heart from it.

Love

People don’t care how much you know until they know how much you care. I believe you need to love the students. I’m talking here about genuine, respectful love for other human beings. We need to care about them as people, not just students in our classes. We need to love our subject and believe intrinsically in it. This shines through, even when we don’t give face-to-face lectures. One student told me he knew I was a good teacher because I was so thorough. I am thorough because I love the students and want them to succeed.

Love is not a word used often in secular higher-education. However I have been privileged to see many great teachers, whose whole approach was centered in love for the students.

These are my ways to help students who are anxious, fearful or less than keen to be taking my course. I teach them to see failure as a step to success, build the material up in small steps, make it real, help them develop self-efficacy, and let them feel how much I care. The rewards are so worth the effort!

Operations Research and Statistics: BFF

As they say on Twitter: That silence after you tell someone you teach Operations Research (or Statistics).

Those in the OR and Statistics communities know what conversation stoppers our disciplines are. When asked what subject I teach I take a punt and respond with “Operations Research”, “Management Science” or “Statistics”. “Operations Research” is met with incomprehension, “Management Science” with miscomprehension, and “Statistics” with thinly disguised antipathy. Apart from being undervalued, what the disciplines have in common is that we do practical stuff with numbers. The pedagogies of these disciplines have much in common.

Operations Research and Management Science (which for many people are synonymous) use statistics and other mathematical analysis techniques to solve real world problems.

Operations Research/Management Science is a discipline which seeks to improve a problem situation by supplying decision makers with information and insights gained through problem analysis, often involving mathematical modelling. (Nicola Ward Petty)

A knowledge of probability and statistics forms part of the OR/MS toolkit, along with linear and non-linear programming, decision analysis, queueing, simulation, heuristics, multicriteria decision-making and operations management tools such as Critical path and inventory control. OR/MS is more than just a set of tools, however, and includes a philosophy of improvement through modelling (as stated in the definition).

Statistics focusses on extracting information from data, and provides the backbone to research in just about all human endeavours, including physics, astronomy, medicine, business, education, psychology, sport and agriculture. Statistical analysis is an essential part of the scientific method. It is often used to inform decision-making, as is OR/MS.

I think it is fair to categorise both Statistics and OR/MS as decision sciences, and mathematical sciences. It is also fair to say that the average person in the street has little comprehension of either of them.

So how does this affect teachers of these disciplines? There has been considerable research into the teaching of statistics, and much less into the teaching of operations research (probably because of the difference in the number of students taking each subject). Volumes such as “The challenge of developing statistical literacy, reasoning and thinking” (2004) and “Developing students’ statistical reasoning: connecting research and teaching practice” (2008), edited by Dani Ben-Zvi and Joan Garfield, provide inspiration and guidance to statistics teachers and educational researchers.

The statistics education research literature accepts as given that there are challenges in teaching quantitative courses. Ben-Zvi & Garfield (2004) state four main challenges to success in teaching and learning statistics, which must resonate with many OR instructors. These can be paraphrased as follows. It can be hard to motivate students to do hard work. Many students have difficulty with the underlying mathematics, and that interferes with learning the related content. The context can mislead students who rely on experience and intuition. And students expect the focus to be on numbers, computations, formulas and one right answer. These can be summarised as:

• motivation to work
• mathematical
• contexts
• inexact

Motivation to work

We accept that Statistics and OR/MS are not as inherently motivating to the majority of our students as we would like. Part of our brief is to help them understand how important the subject can be, which can be done through the use of real world examples, and preferably real-world data. What is also motivating to most humans is learning for its own sake. If students feel the joy of passing from incomprehension to comprehension to mastery, this is deeply motivating. Experiences which lead to successful learning aid student motivation.

Mathematical

It’s true. We use mathematics. But we are not mathematics. And when we can get the same result by avoiding the mathematics, all power to us! No one should calculate standard deviations or solve linear programs by hand any more. The ubiquitous spreadsheet has removed that necessity in all but trivial and explanatory examples. It is increasingly possible to do plausible analysis relying totally on computer packages. This way we can give students non-trivial exercises using real data. A little aside here – I personally found mathematics unappealing when there ceased to be numbers in it other than as subscripts, and promptly switched majors to Operations Research, where my love of numbers and practical problem-solving was indulged.

Contexts

One of my heroes, George Cobb, points out that statistics does not exist without the context. I would suggest the same is true of OR/MS. Remove the application area and we are back in mathematics or math programming. Context can be given in a mathematical example as an unnecessary little story to give pseudo-reality to problems that are inherently abstract. Now there is nothing wrong with abstract – it’s just that statistics and OR/MS aren’t abstract. All problems in statistics and OR/MS should have a context which is relevant and forms part of the answer to the question. Questions like “Find the expected value for the following (context-free) discrete distribution” are to be avoided. Why would we want to know the expected value? What do we do with it when we have got it? Statistics and OR need to answer questions. However, the context can also become a stumbling block when students construct incorrect knowledge based on generalisations of contexts, or allow their own intuition to over-ride what the analysis is telling them.

Inexact

What I used to love about mathematics as a child was getting lots of red ticks (checkmarks not irritating insects) down beside my work. My son once gave me a handmade birthday card covered in red ticks as he knew how much I liked them. In mathematics there was one correct answer. (See my post on Re:Solutions). You had to find x, and when you found it you knew you had the right one. This is SO not true in statistics and Operations Research. Everything “depends”. I now embrace the ambiguity, whereas it felt distinctly uncomfortable at first.

By articulating these four challenges we are better equipped to face them. Let’s try to make our subjects motivating and doable, mathematically appropriate to the audience and with interesting contexts that embrace ambiguity. In this way we can better teach the Science of Better and the Science of Data.