In 1987 George Cobb published a paper evaluating statistics textbooks. I am very grateful for it, as it alerted me to the problems with textbooks, and introduced me to the man himself, whose work I greatly admire. Cobb explains that statistics is an inherently interesting and practical subject, but that many textbooks seem to have missed that, or concealed it from the students.
The discipline of statistics is inherently fascinating, applied and important. So why do so many textbooks make it seem mechanistic and abstract? I have been examining textbooks, and wonder if the writers even like their subject matter, or the students they are supposed to be reaching.
I am particularly interested in textbooks for non-mathematicians. The majority of students of statistics are not mathematicians, and are not planning to take any more statistics than they are required to. These students don’t like mathematics. They feel uneasy about taking the course. They are required to take a statistics course as part of their business, psychology or health sciences major. They aren’t even sure why they need to take the course, and hope to get it over and done with and forget about the experience as soon as possible. A previous post talks about how to help students who are feeling negatively towards the course. A textbook for these students needs to get the tone and content right.
A friendly but authoritative tone is important. Some textbooks go too far and become corny in their chattiness. It’s nice to be friendly, but it can get tiresome, and the examples can be too cute. But most are just too dry – and have too many words. And far too many equations and algorithms. They seem bent on protectionism rather than empowerment.
Even more important is the choice of content, and I find this fascinating. I wonder what course some textbooks are designed for. A telling chapter is regression. Regression is an important statistical technique. But what do we tell students about regression? Here is how I have recently seen it done. Provide an example of real data taken from the web. Introduce the problem, then let the students wait until the end to find out where you are going. Give the mathematical way of expressing a line, using Greek letters. Derive the least squares method of line fitting. Calculate the line by hand. Interpret the slope and the intercept. Calculate the coefficient of determination by hand. Interpret it. Define the residuals, and calculate them. Calculate the F-statistic and t-statistics. Interpret them. Then finish off the story you started at the beginning of the chapter (not that anyone cares anymore).
Some of you may be wondering what is wrong with that. Good – it means I am not preaching to the choir.
Students need to see the whole picture from the beginning. If you absolutely MUST do the mathematics, put it at the end of the chapter for the keen students, but don’t do the maths in the body of the text and scare the others. Do not assume the readers know how to interpret a line. Most don’t. Start with some examples that explain the context, show the line, and explain and apply the model equation. Next work through one example thoroughly, using computer output. Explain the different values and talk about what applies to the sample, and what helps us to generalize to the population. Then provide some more examples, making sure many of them are not statistically significant, some have negative slopes, and all are solving a problem using a sufficiently large sample of real data. Then give them a template for writing up a regression, explaining the different parts. Finally, if you must, you can give them the mathematics. This may keep the instructors happy so that they will buy your book.
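To make "work through one example using computer output" concrete, here is a minimal sketch in Python. The numbers are invented placeholders (hours of study and exam marks) and the names `fit_line`, `hours` and `marks` are my own, used only to keep the example short; in a real course they would be replaced by a sufficiently large sample of real data, and students would read the output from their usual statistics package rather than from this script. The point is that the computer does the arithmetic and the student does the interpreting.

```python
# Hypothetical illustrative values, NOT real data.
hours = [2, 4, 5, 7, 8, 10, 11, 13]      # hours of study (assumed)
marks = [50, 55, 61, 64, 70, 74, 79, 85]  # exam mark out of 100 (assumed)


def fit_line(x, y):
    """Least squares line for y on x: returns slope, intercept and R-squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variation in y accounted for by the line.
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - ss_resid / ss_total
    return slope, intercept, r_squared


slope, intercept, r_squared = fit_line(hours, marks)

# Report the output in words, the way a write-up template would.
print(f"Model: predicted mark = {intercept:.1f} + {slope:.2f} x hours")
print(f"Each extra hour of study goes with a change of {slope:.2f} marks, on average.")
print(f"The line accounts for {r_squared:.0%} of the variation in marks in this sample.")
```

This sketch covers only what applies to the sample. The p-values and confidence intervals that help us generalize to the population are left to the statistics package.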
Another telling bit of content is a textbook’s approach to ordinal data. In my video about types of data, two instructors argue over whether it is permissible to calculate the mean for ordinal data. It ends with them calling each other “nit-picking mathematician” and “sloppy social scientist”. My approach is to take the middle ground. It is not ideal mathematically to calculate a mean for ordinal data, but much of the time people do, so it is best to know why it may cause problems and that there is an issue, rather than pretending that it never happens. Look in the textbook. I would be wary of any text that states categorically that you cannot find the mean for ordinal data.
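A small sketch may help show why the mean can cause problems here. The responses below are invented for illustration, and the alternative coding in `recode` is my own assumption: it keeps the categories in the same order but puts "strongly agree" further from the rest. Ordinal data tell us the order of the categories and nothing about the spacing, so either coding is defensible.

```python
from statistics import mean, median

# Hypothetical responses on a five-point scale
# (1 = strongly disagree ... 5 = strongly agree). Not real data.
group_a = [3, 3, 3, 3, 3]
group_b = [1, 1, 2, 5, 5]

# An alternative coding with the same order but different spacing (assumed).
recode = {1: 1, 2: 2, 3: 3, 4: 4, 5: 10}
a_recoded = [recode[r] for r in group_a]
b_recoded = [recode[r] for r in group_b]

for label, a, b in [("codes 1-5", group_a, group_b),
                    ("recoded", a_recoded, b_recoded)]:
    higher_mean = "A" if mean(a) > mean(b) else "B"
    higher_median = "A" if median(a) > median(b) else "B"
    print(f"{label}: higher mean = group {higher_mean}, "
          f"higher median = group {higher_median}")
```

In this example the group with the higher mean changes when the coding changes, while the comparison of medians stays the same. That is the issue students should know about, even if they go on to report means as most people do.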
There is also the issue of the purpose of the text, both its place in the course, and in the lives of the students. Textbooks can take different roles in courses, largely as a function of the confidence and competence of the instructor. A novice instructor, unsure of the material, is well-advised to stick closely to the textbook. But an experienced and engaged instructor will find the text less and less important, and more a peripheral second opinion and source of homework exercises. The internet and Wikipedia have replaced the textbook as the source of background knowledge. I suspect a textbook is used more as an expensive combination of talisman and doorstop by the students.
“Judge a book by its exercises and you cannot go far wrong,” said George Cobb. All exercises in statistics should have context. There is no place for fitting a line by hand calculation to a set of five points with no context. Leave that to mathematics courses. Statistics is about context, and all examples need to reflect that. The data should be real data, so that an interesting result is authentic, not just something dreamed up by the instructor. The data should even be dirty occasionally (though not too early in the course, and not without warning). And there should be enough data. Don’t perpetuate bad habits by using too few data.
Having said all this, I do wonder what the role of textbooks is in the education of the future. Online materials, which can be frequently updated, and crowd-sourced explanations such as those found on Wikipedia and elsewhere, can fill the place of a textbook.
Or there is always our app – AtMyPace: statistics, which uses video and interactive lessons to teach some important concepts. We are now working to bring this to the web so all can use it. And then maybe I should write a textbook.