Do review sheets help?

A lot of what I do as a college instructor draws upon the accumulated wisdom and practice of my profession, plus my personal experience. I accumulate ideas and strategies from mentors and colleagues, I read about pedagogy, I try to get a feel for what works and what doesn’t in my classes, and I ask my students what is working for them. I suspect that’s what most of us do, and it probably works pretty well.

But as stats guru and blogger Andrew Gelman pointed out not too long ago, we don’t often formally test which of our practices work. Hopefully the accumulated wisdom is valid — but if you’re a social scientist, your training might make you want something stronger than that. In that spirit, recently I ran a few numbers on a pedagogical practice that I’ve always wondered about. Do review sheets help students prepare for tests?


When I first started teaching undergrad courses, I did not make review sheets for my students. I didn’t think they were particularly useful. I decided that I would rather focus my time and energy on doing things for my students that I believed would actually help them learn.

Why didn’t I think a review sheet would be useful? There are 2 ways to make a review sheet for an exam. Method #1 involves listing the important topics, terms, concepts, etc. that students should study. The review sheet isn’t something you study on its own — it’s like a guide or checklist that tells you what to study. That seemed questionable to me. It’s essentially an outline of the lectures and textbook — pull out the headings, stick in the boldface terms, and voila! Review sheet. If anything, I thought, students are better off doing that themselves. (Many resources on study skills tell students to scan and outline before they start reading.) In fact, the first time I taught my big Intro course, I put the students into groups and had them make their own review sheets. Students were not enthusiastic about that.

Method #2 involves making a document that actually contains studyable information on its own. That makes sense in a course where there are a few critical nuggets of knowledge that everybody should know — like maybe some key formulas in a math class that students need to memorize. But that doesn’t really apply to most of the courses I teach, where students need to broadly understand the lectures and readings, make connections, apply concepts, etc. (As a result, this analysis doesn’t really apply to courses that use that kind of approach.)

So in my early days of teaching, I gave out no review sheets. But boy, did I get protests. My students really, really wanted a review sheet. So a couple years ago I finally started making list-of-topics review sheets and passing them out before exams. I got a lot of positive feedback — students told me that they really helped.

Generally speaking, I trust students to tell me what works for them. But in this case, I’ve held on to some nagging doubts. So recently I decided to collect a little data. It’s not a randomized experiment, but even some correlational data might be informative.


In Blackboard, the course website management system we use at my school, you can turn on tracking for items that you post. Students have to be logged in to the Blackboard system to access the course website, and if you turn on tracking, it’ll tell you when (if ever) each student clicked on a particular item. So for my latest midterm, the second one of the term, I decided to turn on tracking for the review sheet so that I could find out who downloaded it. Then I linked that data to the test scores.

I posted the review sheet on a Monday, 1 week before the exam. The major distinction I drew was between people who downloaded the sheet and those who never did. But I also tracked when students downloaded it. There were optional review sessions on Thursday and Friday. Students were told that if they came to the review session, they should come prepared. (It was a Jeopardy-style quiz.) So I divided students into several subgroups: those who first downloaded the sheet early in the week (before the review sessions), those who downloaded it on Thursday or Friday, and those who waited until the weekend before they downloaded it. I have no record of who actually attended the review sessions.
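For anyone who wants to repeat this kind of grouping with their own tracking export, here is a minimal sketch. The calendar date is made up (the post gives only weekdays), the function name and group labels are hypothetical, and treating anything from Saturday onward as "weekend" is an assumption about how to handle exam-day downloads.

```python
from collections import Counter
from datetime import date
from typing import Optional

# Hypothetical posting date; only "a Monday" is known from the post.
POSTED = date(2010, 3, 1)
assert POSTED.weekday() == 0  # Monday


def timing_group(first_download: Optional[date]) -> str:
    """Classify a student by the day they first downloaded the sheet."""
    if first_download is None:
        return "never"
    days_after_posting = (first_download - POSTED).days
    if days_after_posting < 3:   # Monday through Wednesday
        return "early"
    if days_after_posting < 5:   # Thursday or Friday (review session days)
        return "review days"
    return "weekend"             # Saturday or later (assumption)


# Made-up tracking records: student id -> first download date (or None).
first_downloads = {
    "s01": date(2010, 3, 2),
    "s02": date(2010, 3, 5),
    "s03": date(2010, 3, 7),
    "s04": None,
}
group_sizes = Counter(timing_group(d) for d in first_downloads.values())
print(group_sizes)
```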

A quick caveat: It is possible that a few students could’ve gotten the review sheet some other way, like by having a friend in the class print it for them. But it’s probably reasonable to assume that wasn’t widespread. More plausible is that some people might have downloaded the review sheet but never really used it, which I have no way of knowing about.


Okay, so what did I find? First, out of N=327 students, 225 downloaded the review sheet at some point. Most of them (173) waited until the last minute and didn’t download it until the weekend before the exam. 17 downloaded it Thursday-Friday, and 35 downloaded it early in the week. So apparently most students thought the review sheet might help.

Did students who downloaded the review sheet do any better? Nope. Zip, zilch, nada. The correlation between getting the review sheet and exam scores was virtually nil, r = -.04, p = .42. Here’s a plot, further broken down into the subgroups:

[Figure: Review Sheet 1]
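The correlation between a yes/no variable (downloaded or not) and a test score is just an ordinary Pearson correlation with the yes/no variable coded 0/1. A minimal sketch of that calculation follows. The group sizes match the ones reported above, but the scores are randomly generated placeholders, not the real exam data, and the function name is hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation; with a 0/1 variable this is the point-biserial r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# 225 downloaders and 102 non-downloaders, as in the post.
downloaded = [1] * 225 + [0] * 102

# Placeholder scores drawn independently of downloading (NOT the real data).
rng = random.Random(0)
scores = [rng.gauss(75, 10) for _ in downloaded]

r = pearson_r(downloaded, scores)
n = len(scores)
# t statistic for testing r against zero, with n - 2 degrees of freedom.
t = r * math.sqrt((n - 2) / (1 - r * r))
print(f"r = {r:.2f}, t({n - 2}) = {t:.2f}")
```

Because the placeholder scores are generated without any link to downloading, the printed r should land near zero, which is the pattern the real data showed.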

This correlational analysis has potential confounds. Students were not randomly assigned — they decided for themselves whether to download the review sheet. So those who downloaded it might have been systematically different from those who did not; and if they differed in some way that would affect their performance on the second midterm, that could’ve confounded the results. In particular, perhaps the students who were already doing well in the class didn’t bother to download the review sheet, but the students who were doing more poorly downloaded it, and the review sheet helped them close the gap. If that happened, you’d observe a zero correlation. (Psychometricians call this a suppressor effect.)

So to address that possibility, I ran a regression in which I controlled for scores on the first midterm. The simple correlation asks: did students who downloaded the review sheet do better than students who didn’t? The regression asks: did students who downloaded the review sheet do better than students who performed just as well on the first midterm but didn’t download the sheet? If there was a suppressor effect, controlling for prior performance should reveal the effect of the review sheet.
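To see why the adjustment matters, here is a small simulation sketch of the scenario described above. Everything in it is invented for illustration: weaker students are made more likely to download, and downloading is given a built-in 5-point benefit. The raw comparison hides that benefit, while the comparison that controls for the first midterm recovers it. The adjusted coefficient is computed by residualizing both variables on the first midterm, which gives the same (unstandardized) coefficient as a two-predictor regression.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting ys from xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def residuals(xs, ys):
    """What is left of ys after removing the linear prediction from xs."""
    slope, intercept = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def adjusted_effect(covariate, predictor, outcome):
    """Coefficient of predictor on outcome, controlling for covariate."""
    slope, _ = fit_line(residuals(covariate, predictor),
                        residuals(covariate, outcome))
    return slope


# Simulated class (all numbers are assumptions, not the real data).
rng = random.Random(1)
TRUE_BOOST = 5.0  # built-in benefit of downloading, in points
midterm1, downloaded, midterm2 = [], [], []
for _ in range(2000):
    m1 = rng.gauss(75, 10)
    # Students who did worse on midterm 1 are more likely to download.
    p_download = 1 / (1 + math.exp((m1 - 75) / 5))
    d = 1 if rng.random() < p_download else 0
    m2 = 75 + 0.6 * (m1 - 75) + TRUE_BOOST * d + rng.gauss(0, 6)
    midterm1.append(m1)
    downloaded.append(d)
    midterm2.append(m2)

raw_gap = (mean([m for m, d in zip(midterm2, downloaded) if d])
           - mean([m for m, d in zip(midterm2, downloaded) if not d]))
adjusted = adjusted_effect(midterm1, downloaded, midterm2)
print(f"raw gap: {raw_gap:.1f} points, adjusted effect: {adjusted:.1f} points")
```

If the real data had looked like this simulation, controlling for the first midterm would have uncovered an effect that the simple correlation missed.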

But that isn’t what happened. The two midterms were pretty strongly correlated, r = .63. But controlling for prior performance made no difference: the review sheet still had no effect. The standardized beta was .00, p = .90. Here’s a plot to illustrate the regression. This time, the y-axis is the residual (somebody’s actual score minus the score we would have expected them to get based on the first midterm):

[Figure: Review Sheet 2]

Limitations

This was not a highly controlled study. As I mentioned earlier, I have no way of knowing whether students who downloaded the review sheet actually used it. I also don’t know who used a review sheet for the first midterm, the one that I controlled for. (I didn’t think to turn on tracking at the start of the term.) And there could be other factors I didn’t account for.

A better way to do this would be to run a true experiment. If I were going to do this right, I’d go into a class where the instructor isn’t planning to give out review sheets. I’d tell students that if they enroll in the experiment, they’ll be randomly assigned to get different materials to help them prepare for the test. Then I’d give a random half of them a review sheet and tell them to use it. For both ethical and practical reasons, I would probably want to tell everybody in advance that scores will be adjusted so that if there is an effect, students who didn’t get the sheet (either because they were in the control group or because they chose not to participate) won’t be at a disadvantage. I’d have to be careful in what I told them about the experiment, balancing informed consent against the risk of creating demand characteristics. But it could probably be done.
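The random-assignment step itself is simple. A minimal sketch, assuming a list of IDs for students who consented (the function name, condition labels, and IDs are all hypothetical):

```python
import random


def assign_conditions(student_ids, seed):
    """Shuffle consenting students and give the first half the review sheet."""
    rng = random.Random(seed)  # fixed seed so the assignment can be audited
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        sid: ("review sheet" if position < half else "control")
        for position, sid in enumerate(shuffled)
    }


# Made-up IDs for 100 consenting students.
conditions = assign_conditions([f"s{i:03d}" for i in range(100)], seed=2024)
n_sheet = sum(1 for c in conditions.values() if c == "review sheet")
print(n_sheet, "students get the review sheet;", len(conditions) - n_sheet, "are controls")
```

Shuffling and splitting, rather than flipping a coin per student, keeps the two groups the same size.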


In spite of these issues, I think this data is strongly suggestive. The most obvious confounding factor was prior performance, which I was able to control for. If some of the students who downloaded the review sheet didn’t use it, that would attenuate the difference, but it shouldn’t make it go away entirely. To me, the most plausible explanation left standing is that review sheets don’t make a difference.

If that’s true, why do students ask for review sheets and why do they think that they help? As a student, you only have a limited capacity to gauge what really makes a difference for you, because on any given test, you will never know how well you would have done if you had studied differently. (By “limited capacity,” I don’t mean that students are dumb; I mean that there’s a fundamental barrier.) So a lot of what students do is rely on feelings. Do I feel comfortable with the material? Do I feel like I know it? Do I feel ready for the exam? And I suspect that review sheets offer students an illusory feeling of control and mastery. “Okay, I’ve got this thing that’s gonna help me. I feel better already.” So students become convinced that review sheets make a difference, and then they insist on getting them.

I also suspect, by the way, that lots of other things work that way. To date, I have steadfastly refused to give out my lecture slides before the lecture. Taking notes in your own words (not rote) requires you to be intellectually engaged with the material. Following along on a printout might feel more relaxed, but I doubt it’s better for learning. Maybe I’ll test that one next time…

Students, fellow teachers, and anybody else: I’d welcome your thoughts and feedback, both pro and con, in the comments section. Thanks!

Thinking hard

I’ve been enjoying William Cleveland’s The Elements of Graphing Data, a book I wish I’d discovered years ago. The following sentence jumped out at me:

No complete prescription can be designed to allow us to proceed mechanically and to relieve us of thinking hard. (p. 59)

The context was — well, it doesn’t matter what the context was. It’s a great encapsulation of what statistical teaching, mentoring, and consulting should be (teaching how to think hard) and cannot be (mechanical prescriptions).

Teaching is a social interaction

Howard Gardner suggests that the next big leap for teaching will be “personalized education,” in which people will learn from computers that adapt to their individual learning style:

Well-programmed computers—whether in the form of personal computers or hand-held devices—are becoming the vehicles of choice. They will offer many ways to master materials. Students (or their teachers, parents, or coaches) will choose the optimal ways of presenting the materials. Appropriate tools for assessment will be implemented. And best of all, computers are infinitely patient and flexible. If a promising approach does not work the first time, it can be repeated, and if it continues to fail, other options will be readily available.

My response to this is a big fat humbug. Gardner has put forward some interesting ideas about multiple intelligences and different learning styles. But the notion that computers will supplant human teachers strikes me as overreaching.

Teaching is, at its core, a social interaction between teacher and student. That is why MIT isn’t putting itself out of business by putting gobs of course materials online. Teachers do not create new information. (Or at least — if they’re at a university and also do research — not in their role as teachers.) And frankly, they don’t often package it into some novel format (“here is a bodily-kinesthetic presentation of Bayes’ Theorem”). What teachers do is convey information through a social interaction with their students. Perhaps some day we’ll know enough about how to turn computers into compelling social agents that can reproduce that experience. But until then, I’m not worried about technology supplanting human teachers.