When you have an interaction, which variable moderates which?

I was talking recently with a colleague about interpreting moderator effects, and the question came up: when you have a 2-way interaction between A and B, how do you decide whether to say that A moderates B versus B moderates A?

Mathematically, of course, A*B = B*A, so the underlying math is indifferent. I was schooled in the Baron and Kenny approach to moderation and mediation. I’ve never found any hard-and-fast rules in any of Kenny’s writing on the subject (if I’ve missed any, please let me know in the comments section). B&K talk about the moderator moderating the effect of the “focal” variable, and I’ve always taken that to be an interpretive choice by the researcher. If the researcher’s primary goal is to understand how A affects Y, and in the researcher’s mind B is some other interesting variable across which the A->Y relationship might vary, then B is the moderator. And vice versa. To me, it’s entirely legitimate to talk about the same analysis in different ways — it’s a framing issue rather than a deep substantive issue.

However, my colleague has been trying to apply Kraemer et al.’s “MacArthur framework” and has been running into some problems. One of the MacArthur rules is that the variable you call the moderator (M) is the one that comes first, since (in their framework) the moderator always temporally precedes the treatment (T). But in my colleague’s study the ordering is not clear. (I believe that in my colleague’s study, the variables in question meet all of Kraemer’s other criteria for moderation — e.g., they’re uncorrelated — but they were measured at the same timepoint in a longitudinal study. Theoretically it’s not clear which one “would have” come first. Does it come down to which one came first in the questionnaire packet?)

I’ll admit that I’ve looked at Kraemer et al.’s writing on mediation/moderation a few times and it’s never quite resonated with me — they’re trying to make hard-and-fast rules for choosing between what, to me, seem like 2 legitimate alternative interpretations. (I also don’t really grok their argument that a significant interaction can sometimes be interpreted as mediation — unless it’s “mediated moderation” in Kenny-speak — but that’s a separate issue.) I’m curious how others deal with this issue…
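For what it’s worth, the symmetry is easy to see in a quick regression sketch. This is a minimal, hedged example with simulated data and made-up coefficients (none of the numbers or variable names come from Kenny’s or Kraemer’s papers): the model estimates a single interaction coefficient, and choosing which variable to call the moderator just determines which simple-slope decomposition you report.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
a = rng.normal(size=n)
b = rng.normal(size=n)
# Made-up generating model with a true interaction of 0.4
y = 0.5 * a + 0.3 * b + 0.4 * a * b + rng.normal(size=n)

# Ordinary least squares with intercept, A, B, and the product term
X = np.column_stack([np.ones(n), a, b, a * b])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, bA, bB, bAB = beta

# "B moderates A": report the simple slope of A at chosen levels of B
def slope_of_a(at_b):
    return bA + bAB * at_b

# "A moderates B": report the simple slope of B at chosen levels of A
def slope_of_b(at_a):
    return bB + bAB * at_a
```

Swapping the labels changes nothing in the fitted model — same coefficient bAB either way — only which family of simple slopes you choose to narrate.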

When NOT to run a randomized experiment

Just came across a provocative article about the iatrogenic effects of self-help cognitive-behavioral therapy (CBT) books:

Self-help books based on the traditional principles of CBT, including popular titles like ‘CBT for Dummies’, can do more harm than good, according to a new study. The risks were highest for readers described as ‘high ruminators’ – those who spend time mulling over the likely causes and consequence of their negative moods.

The gist of the research (by Gerald Haeffel and colleagues) is that in some people’s hands — specifically, people prone to engage in rumination — self-guided CBT techniques can exacerbate depressive symptoms. In CBT, clients are often taught to pay attention to their negative thoughts so they can recognize and change them. But ruminators are already excessively focused on negative thoughts, which is why they are at higher risk for depression. Following a book on their own, without the help of a dedicated therapist, ruminators may be encouraged to ruminate even more without acquiring the skills to take the next step of challenging and altering those thought patterns.

What’s interesting from a research-design perspective is that this finding comes from a study that crossed a randomized manipulation (giving people traditional CBT self-help books vs. 2 control conditions) with a person variable (individual differences in proneness to rumination) and found a meaningful statistical interaction. As such, it is able to identify a causal process that is stronger within a subset of the population.

What this design doesn’t tell us, though, is what the real-world effects will look like. Experimental randomization means that high and low ruminators were equally likely to get the CBT books. In the real world we cannot assume this would be the case. If ruminators are more likely than non-ruminators to seek out these kinds of books — maybe they seek out books that are compatible with their existing cognitive tendencies — then the problem would be even worse than the experiment suggests. On the other hand, if ruminators are less likely to seek out CBT-based self-help books (maybe recognizing that the advice inside isn’t going to help them), then self-selection would mitigate the real-world effects.

So a useful follow-up study to complement this work would be an observational design, in which high and low ruminators are allowed to select among books with and without the harmful CBT components, and you could model whether such self-selection mediates effects on depressive symptoms.
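A toy simulation can make the self-selection point concrete. Everything below is invented for illustration — the effect sizes, the selection probabilities, and the 50/50 split of high and low ruminators are my assumptions, not numbers from Haeffel et al.:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
ruminator = rng.random(n) < 0.5  # high vs. low ruminators, 50/50 (assumed)

# Made-up effect of reading the book on depressive symptoms:
# harmful (+2.0) for high ruminators, mildly helpful (-0.5) for low ruminators
effect = np.where(ruminator, 2.0, -0.5)

def mean_outcome(p_read_high, p_read_low):
    """Average symptom change when high/low ruminators pick up the book
    with different probabilities (i.e., under self-selection)."""
    reads = np.where(ruminator,
                     rng.random(n) < p_read_high,
                     rng.random(n) < p_read_low)
    return (effect * reads).mean()

randomized = mean_outcome(0.5, 0.5)  # experiment: equal exposure
seek_out = mean_outcome(0.8, 0.2)    # ruminators drawn to the book
avoid = mean_outcome(0.2, 0.8)       # ruminators steer clear
```

Under these assumptions, seek_out > randomized > avoid: the same causal interaction produces more or less aggregate harm in the real world depending on who actually reads the book.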

Rethinking intro to psych

Inside Higher Ed has a really interesting article, Rethinking Science Education, about how some universities are trying to break the mold of the traditional intro-to-a-science course. From the article:

Too many college students are introduced to science through survey courses that consist of facts “often taught as a laundry list and from a historical perspective without much effort to explain their relevance to modern problems.” Only science students with “the persistence of Sisyphus and the patience of Job” will reach the point where they can engage in the kind of science that excited them in the first place, she said.

This is exactly how Intro to Psych is taught pretty much everywhere — as a laundry list of topics and findings, usually old ones. The scientific method is presented didactically as another topic in the list (usually the first one), rather than being woven into the daily experience of the class.

It’s a problem that’s easy to point out, but hard to solve. You almost couldn’t do it as a single instructor working within a traditional curriculum. Our majors take a 4-course sequence: 2 terms of intro, then statistics, then research methods. You’d essentially need to flip that around — start with a course called “The Process of Scientific Discovery in Psychology” and have students start collecting and analyzing data before they’ve even learned most of the traditional Intro topics. Such an approach is described in the article:

One approach to breaking out of this pattern, she said, is to create seminars in which first-year students dive right into science — without spending years memorizing facts. She described a seminar — “The Role of Asymmetry in Development” — that she led for Princeton freshmen in her pre-presidential days.

She started the seminar by asking students “one of the most fundamental questions in developmental biology: how can you create asymmetry in a fertilized egg or a stem cell so that after a single cell division you have two daughter cells that are different from one another?” Students had to discuss their ideas without consulting texts or other sources. Tilghman said that students can in fact engage in such discussions and that in the process, they learn that they can “invent hypotheses themselves.”

Would this work in psychology? I honestly don’t know. One of the big challenges in learning psychology — which generally isn’t an issue for biology or physics or chemistry — is the curse of prior knowledge. Students come to the class with an entire lifetime’s worth of naive theories about human behavior. Intro students wouldn’t invent hypotheses out of nowhere — they’d almost certainly recapitulate cultural wisdom, introspective projections, stereotypes, etc. Maybe that would be a problem. Or maybe it would be a tremendous benefit — what better way to start off learning psychology than to have some of your preconceptions shattered by data that you’ve collected yourself?

Do learning styles really exist? Pashler et al. say no

Do different people have different learning styles? It has become almost an article of faith among educators and students that the answer is yes, in large part due to the work of Howard Gardner (who recently went so far as to suggest that computerized assessment of learning styles may someday render traditional classroom teaching obsolete).

But a new review by Hal Pashler and colleagues suggests otherwise. They find ample evidence that people believe they have different learning styles — but almost no evidence that such styles actually exist.

When I first encountered Gardner’s theory of multiple intelligences as an undergrad, I found it fascinating. But I’ll admit that the more I teach, the more I’ve become skeptical when people invoke it. In principle it could lead to an optimistic, proactive attitude about learning: if a student isn’t making progress, let’s try teaching and learning in another modality. But in my experience, people invoke learning styles to almost the opposite effect. “I [or you] have a different learning style” has 2 problems with it. One, it’s an attributional “out” for somebody who isn’t doing well in class — it’s kind of a socially acceptable way of excusing poor performance by both teacher and student. And two, it’s an entity-theorist explanation (in the Carol Dweck sense) that can lead students to disengage from a class.

But skepticism about how people invoke it isn’t as deep as skepticism about the very existence of the phenomenon, which is where Pashler et al. are aiming. They acknowledge something well known among intelligence researchers: that there are subdomains of intellectual ability — e.g., in comparing two people with the same general IQ, one might be better at verbal tasks and the other better at visual-spatial tasks. But that’s about ability — Person A is better at one thing and Person B is better at another. Learning styles suggest that Persons A and B could both be good at the same thing if it were only presented to each in a custom-tailored way. Pashler et al. call this the “meshing hypothesis,” and they say that well-designed, controlled studies find no support for it.

I don’t think this is the death-knell for multimodal teaching. When I teach statistics, I try to present each concept in as many modes as possible — a verbally narrated explanation, a visual depiction, a formal-symbolic representation (i.e., words, pictures, and equations). I still think that is a good way to teach. But the surviving rationale is that any one student will benefit from seeing the same underlying concept represented 3 different ways — not because the 3 modalities will reach 3 different kinds of students.

Of course, I’m sure this won’t be the last word. I expect there will be a vigorous response from Gardner and others. Stay tuned.

UPDATE: In re-reading this post, I realized I should probably clarify my references to Gardner. Gardner’s theory of multiple intelligences is centrally about abilities, not learning styles; in that sense, it is not directly challenged by this research. However, I think Gardner is relevant for two reasons. One, I think a lot of people who discuss learning styles look to him as a role model and a leader. Multiple intelligences is often mentioned in conjunction with learning styles, and they both fall under a larger umbrella of proposing that we need to respect and work around cognitive diversity. Two, Gardner himself has discussed the idea that different students learn in different ways — not just that different people are good at different things. So even though MI theory is more about abilities, I think Gardner is an important influence on a set of related ideas.

Happy anniversary

Blogging has been kind of slow due to the birth of my son 2 weeks ago. I’ll be back at it soon, I promise. In the meanwhile, let me briefly pause to wish a happy anniversary to On the Origin of Species, which was published 150 years ago today. I recommend that you celebrate by buying yourself a t-shirt.

A student’s perspective on PowerPoint lectures

A student blogger who goes by Carolyn Blogs has an interesting entry on PowerPoint lectures from the perspective of someone taking the class:

Recently I came to the conclusion that I do not learn well from classes in which the lectures are based on PowerPoint presentations… Professors who use PowerPoint tend to present topics very quickly when they don’t have to do anything but talk. If every example and every diagram is on the screen, there isn’t much time for me to take notes on the subject of each slide. Lectures aided by chalkboard visuals are easier to take notes from because I can write what the professor writes on the board at the same time. Also, because there is usually more chalkboard space than screen space, if I am behind on note-taking, the visual will probably still be on the board for me to copy a few minutes later. A lot of professors try to solve this problem by handing out the lecture slides before class, or by posting them online. While this is great for a lot of students, it doesn’t work for me because I learn best and am most engaged if I have to take notes as if my grade depended on having a great record of the class and I would never see the material again. In classes with handouts, I tend to zone out and have to work harder to pay attention. Studies have shown [pdf] that taking high-quality notes improves organic memory: I rarely use my notes after the lecture because the act of physically writing information down helps me remember more of what goes on in class.

A few years ago I started phasing out PowerPoint from my upper-division classes (I never used it for grad classes). Carolyn hits on pretty much all the major reasons.

Teaching with PowerPoint has a different pace and structure than teaching with chalk or markers. It’s not just about overall fast vs. slow (though that’s part of it), but about when you go fast and when you go slow. When I use the board, I write down the major points, terms, definitions, etc. That forces me to slow down at exactly the moment when I’m making a big point and students should be attending closely. Once the critical information is on the board, I can elaborate, discuss with the class, ask questions, etc. while it hangs up there behind me for students to refer to. And since writing slows me down, I don’t give as much emphasis to relatively minor points — giving students an additional cue as to what’s more and less important. (“Don’t ignore this completely, but it’s not as central as what I said earlier.”) You can reproduce this kind of pacing and structure with PowerPoint, but in practice it’s difficult to do during a live performance in front of a classroom. You have to write your presentation with delivery (not just content) in mind. Otherwise it’s just too easy to blow through major and minor points at a constant pace.

Another point that she makes… I still use PowerPoint in my big introductory classes (though I make my own slides from scratch, use animation to help regulate my delivery, and try to avoid the mind-numbing bullety templates). I always have a few students ask me to post the notes before class. I don’t — I post them after class, but honestly, I have sometimes wondered if I’d be better off not posting them at all. Carolyn modestly writes “while [posting notes] is great for a lot of students, it doesn’t work for me…” but I actually think this describes most students. A lot of students misread their internal cues — if it feels like they are expending a lot of effort then they think they must be struggling with the material. Actually, though, if the professor is presenting challenging material, then you shouldn’t feel relaxed — relaxation is a sign that you’re probably thinking superficially or zoning out, not that you’ve quickly mastered the material.

I also found it impressive that Carolyn reached this conclusion on her own. Because frankly, it’s fundamentally very difficult to introspect into your own learning processes. A few years back, when I started moving away from PowerPoint, I got feedback on my student evaluations from people who wanted more PowerPoint. When I talked with students who felt that way, they thought they’d be able to focus more on the material if they didn’t have to bother taking notes. I realized that reflects a fundamental misunderstanding of what note-taking does for you. I’ve been getting less of that feedback lately — maybe because I’ve gotten better at using the board, or maybe because recent students have been around PowerPoint longer and see its limitations more clearly.

Say it again

When students learn writing, they often are taught that if you have to say the same kind of thing more than once, word things in a slightly different way each time. The idea is to add interest through variety.

But when I work with psychology students on their writing, I often have to work hard to break them of that habit. In scientific writing, precision and clarity matter most. This doesn’t mean that scientific writing cannot also be elegant and interesting (the vary-the-wording strategy is often just a cheap trick anyhow). But your first priority is to make sure that your reader knows exactly what you mean.

Problems arise when journalists trained in vary-the-wording write about statistics. Small thing, but take this sentence from a Slate piece (in the oft-enlightening Explainer column) about the Fort Hood shooting:

Studies have shown that the suicide rate among male doctors is 40 percent higher than among men overall and that female doctors take their own lives at 130 percent the rate of women in general.

The same comparison is being made for men and for women: how does the suicide rate among doctors compare to the general population? But the numbers are not presented in parallel. For men, the number presented is 40, as in “40 percent higher than” men in general. For women, the number is 130, as in “130 percent the rate of” women in general.

The prepositions are the tipoff that the writer is doing different things, and a careful reader can probably figure that out. But the attempt to add variety just bogs things down. A reader will have to slow down and possibly re-read once or twice to figure out that 40% and 130% are both telling us that doctors commit suicide more often than others.
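To make the two phrasings concrete, here is the arithmetic with hypothetical base rates (the Slate piece reports only the relative figures, so the base rates below are invented for illustration):

```python
# Hypothetical base rates, invented for illustration
men_rate = 20.0    # suicides per 100,000 among men overall (made up)
women_rate = 6.0   # suicides per 100,000 among women overall (made up)

# "40 percent higher than" means multiply the base rate by 1.40
male_doctor_rate = men_rate * 1.40

# "130 percent the rate of" means multiply the base rate by 1.30
female_doctor_rate = women_rate * 1.30

# Put both in the same, parallel phrasing: percent above the base rate
male_excess = (male_doctor_rate / men_rate - 1) * 100      # 40 percent higher
female_excess = (female_doctor_rate / women_rate - 1) * 100  # 30 percent higher
```

Written in parallel, the sentence would simply say: 40 percent higher for male doctors, 30 percent higher for female doctors.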

Separately: why break it out by gender? In context, the writer is trying to make a point about doctors versus everybody else. Not male doctors versus female doctors. We often reflexively categorize things by gender (I’m using “we” in a society-wide sense) when it’s unnecessary and uninformative.

Causality, genes, and the law

Ewen Callaway in New Scientist reports:

In 2007, Abdelmalek Bayout admitted to stabbing and killing a man and received a sentence of 9 years and 2 months. Last week, Nature reported that Pier Valerio Reinotti, an appeal court judge in Trieste, Italy, cut Bayout’s sentence by a year after finding out he has gene variants linked to aggression. Leaving aside the question of whether this link is well enough understood to justify Reinotti’s decision, should genes ever be considered a legitimate defence?

Short answer: probably not.

Long answer: This reminds me of an issue I have with the Rubin Causal Model. In Holland’s 1986 paper on the RCM, he has a section titled “What can be a cause?” He introduces the notion of potential exposability – basically the idea that something can only be a cause if you could, in principle, manipulate it. He contrasts causes with attributes – features of individuals that are part of the definition of the individual. He uses as an example the statement, “She did well on the exam because she is a woman.” Gender can be statistically associated (correlated) with an outcome, but it cannot be a cause (according to Holland and I believe Rubin as well), because the person who did well on the exam would not be the same person if “she” weren’t a woman.

On a scientific/philosophical level, I’ve never liked the way they make the cause/attribute distinction. The RCM is so elegant and logical and principled, and then they tack on this very pragmatic and mushy issue of what can and cannot be manipulated. If technology changes so that something becomes manipulable, or if someone else thinks of a manipulation that escapes the researcher’s imagination (sex reassignment surgery?), things can shift back and forth between being classed as causes and as attributes. Philosophically speaking: Blech. Plus, it leads to places I don’t really like. What about: “Jane didn’t get the job because she is a woman.” Is Holland saying that we cannot say that an applicant’s gender affected the employer’s hiring decision?

I think we just need to be better about defining the units and the nature of the counterfactuals. If we are trying to draw inferences about Jane as she existed at a specific date, time, and place, and therefore, as a principled matter of defining the question (not as a pragmatic concern), we take it as an a priori fact that Jane has to be a woman for the purposes of this problem, then okay: we’ve defined our problem space in a particular way that excludes “is a man” as a potential state of Jane. But if we are trying to draw inferences in which the units are exam-takers or job applicants, and Jane is one of many potential members of that population of units, then we’re dealing with a totally different question. In that case, we could have had either a man or a woman take the exam or apply for the job. Put another way: what is the counterfactual to Jane taking the exam or Jane applying for the job? If Jane could have been John for purposes of the problem that we are trying to solve, then it makes perfectly good sense to say that “Jane did well on the exam because she is a woman” is a coherent causal inference. It comes back to a principled matter of how we have defined the problem, not a practical question of manipulability.

So back to the criminal… Holland (and Rubin) would frame the question as, “Is the MAOA-L variant a cause or an attribute?” And then they’d get into questions of whether you could manipulate that gene. And right now we cannot, so it’s an attribute; but maybe someday we’ll be able to, and then it’ll be a cause.

But I’d instead approach it by asking: what are the units, and what’s the counterfactual? To a scientist, it makes perfect sense to formulate a causal-inference problem in which the universe of units consists of all possible persons. Then we compare two persons whose genomes are entirely identical except for their MAOA variant, and we ask what the potential outcomes would be if one vs. the other was put in some situation that allows you to measure aggressive behavior. So the scientist gets to ask questions about MAOA causing aggression, because the scientist is drawing inferences about how persons behave, and MAOA is a variable across those units (generic persons).

But a court is supposed to ask different kinds of causal questions. The court judges the actual individual before it. And the units are potential or actual actions of that specific person as he existed on the day of the alleged crime. The units are not members of the generic category of persons. Thus, the court should not be considering what would happen if the real Abdelmalek Bayout had been replaced by a hypothetical almost-Bayout with a minutely different genome. A scientist can go there, but a court cannot. Rather, the court’s counterfactual is a different behavior from the very same real-world Abdelmalek Bayout, i.e., a Bayout who didn’t stab anybody on that day in 2007. And if Bayout had not stabbed anybody, there’d be no murder. But since he did, he caused a murder.

Addendum: it’s a totally different question of whether we want to hold all persons to the same standards. For example, we have the insanity defense. But there, it’s not a question of causality. In fact, defendants who plead insanity have to stipulate to the causal question (e.g. in a murder trial, they have to acknowledge that the defendant’s actions caused the death of another). The question before the court basically becomes a descriptive question — is this person sane or insane? — not a causal one.

Should we fire all the adjuncts (and hire them back for real)?

I just came across a thought-provoking interview with Cary Nelson, president of the AAUP. The video is titled Twilight of Academic Freedom. It deals with the consequences of increasing numbers of “contingent faculty” in higher education — the adjuncts, visiting professors, instructors, and various other titles for instructional staff who do not have the protections of tenure.

Right now, many universities are looking for ways to save money, and one way to do that is to hire fewer tenure-related faculty and shift the teaching burden onto adjuncts who are hired for as little as the uni can get away with paying. (It’s worth noting that this trend started well before the current recession, though I wouldn’t doubt that it’s accelerated.) Nelson is concerned about universities that are moving toward having an increasing share of teaching done by such contingent faculty.

Adjunct positions have a useful place in universities when used for the right reasons. One such reason is to expose students to perspectives that come from outside of the academy. For example, my undergraduate Abnormal Psychology class was taught by an adjunct whose main job was as a clinical psychologist at a hospital. That gave her a wealth of stories and practical experience that she could bring to the classroom.

But using adjuncts as a cost-cutting measure is a different thing. Many adjuncts will tell you that the system exploits instructors who work at low wages as a way to remain in the game while they hunt for better-paying permanent jobs. Those jobs typically don’t exist in high enough numbers to hire everybody who’s circling in the adjunct holding pattern.

Nelson offers a different line of argument, one that stems from the core reason tenure exists in the first place: academic freedom. To quote from the interview, “Academic freedom and job security are inextricably linked.” Tenure ensures that a professor can choose what to teach based on professional judgment. Direct review of those decisions is made by professional peers, protecting individual faculty from legislators, donors, regents, and others who might wield their considerable influence to drum out professors who don’t fit some outside agenda.

Nelson is not just worried about individual adjuncts being vulnerable. Even more ominous are the systemic risks of a university shifting to an adjunct-heavy portfolio. Hiring the occasional adjunct at an institution with a solid core of tenure-protected faculty is not likely to be a problem, as long as tenured faculty care enough about academic freedom that they’ll raise a stink if an adjunct is being subject to inappropriate pressure. (It’s sort of intellectual herd immunity.) But without that core, when too many of your faculty could be threatened for teaching something that someone does not like, the institution loses an important protection. Just look at the battles over secondary school textbooks in biology and history for an example of the kind of political infighting that can result. Is that where higher education could end up — with a state board telling me what to teach and what textbooks to use? I hope not, but Nelson presents good reasons to worry.

Rhymes with schmersonality

Kirstin Appelt of the Center for Research on Environmental Decisions at Columbia has put together a nifty online index of personality measures. It’s called the Decision Making Individual Differences Inventory, which abbreviates to DMIDI. The email announcement I just got helpfully points out that the name “rhymes with ‘p. diddy’.” That may be my second-favorite part.

My first favorite part is that it’s a cleanly-designed, well-put-together website that looks like it will have tons of useful information for researchers. And even though it has an emphasis on measures relevant to decision-making research, the site casts a pretty wide net, including a number of Big Five measures and measures of “cognitive ability.” The latter gets snoot-quotes because for some reason economists and JDM researchers don’t like the word “intelligence.” (For that matter, they have a pretty narrow view of the word “personality” too. The section for trait measures is simply labeled “personality,” which is somehow placed in contrast to measures of motivation, attitudes, cognitive style, and ability — all of which, of course, are part of what makes you the person that you are, i.e., your personality.)

But I digress. It’s still under construction, but it looks like it will be a great resource. The site is set up as a wiki, which raises the possibility that they’ll be able to harness the academic community’s energy in updating and expanding it. I can see why they might be cautious about going down that road (who wants to moderate an edit war between a bunch of cantankerous professors?), but even in its current form it’s really nice.