Do not use what I am about to teach you

I am gearing up to teach Structural Equation Modeling this fall term. (We are on quarters, so we start late — our first day of classes is next Monday.)

Here’s the syllabus. (pdf)

I’ve taught this course a bunch of times now, and each time I teach it I add more and more material on causal inference. In part it’s a reaction to my own ongoing education and evolving thinking about causation, and in part it’s from seeing a lot of empirical work that makes what I think are poorly supported causal inferences. (Not just articles that use SEM either.)

Last time I taught SEM, I wondered if I was heaping on so many warnings and caveats that the message started to veer into, “Don’t use SEM.” I hope that is not the case. SEM is a powerful tool when used well. I actually want the discussion of causal inference to help my students think critically about all kinds of designs and analyses. Even people who only run randomized experiments could benefit from a little more depth than the sophomore-year slogan that seems to be all some researchers (AHEM, Reviewer B) have been taught about causation.

Modeling the Jedi Theory of Emotions

Today I gave my structural equation modeling class the following homework:

In Star Wars Episode I: The Phantom Menace, Yoda presented the Jedi Theory of Emotions: “Fear is the path to the dark side. Fear leads to anger. Anger leads to hate. Hate leads to suffering.”

1. Specify the Jedi Theory of Emotions as a path model with 4 variables (FEAR, ANGER, HATE, and SUFFERING). Draw a complete path diagram, using lowercase Roman letters (a, b, c, etc.) for the causal parameters.

2. Were there any holes or ambiguities in the Jedi Theory (as stated by Yoda) that required you to make theoretical assumptions or guesses? What were they?

3. Using the tracing rule, fill in the model-implied correlation matrix (assuming that all variables are standardized):

            FEAR   ANGER   HATE   SUFFERING
FEAR        1
ANGER       ___    1
HATE        ___    ___     1
SUFFERING   ___    ___     ___    1

4. Generate a plausible equivalent model. (An equivalent model is a model that specifies a different causal structure but implies the same correlation matrix.)

5. Suppose you run a study and collect data on these four variables. Your data gives you the following correlation matrix.

            FEAR   ANGER   HATE   SUFFERING
FEAR        1
ANGER       .5     1
HATE        .3     .6      1
SUFFERING   .4     .3      .5     1

Is the Jedi Theory a good fit to the data? In what way(s), if any, would you revise the model?

Some comments…

For #1, everybody always comes up with a recursive, full mediation model — e.g., fear only causes hate via anger as an intervening cause, and there are no loops or third-variable associations between fear and hate, etc. It’s an opportunity to bring up the ambiguity of theories expressed in natural language: just because Yoda didn’t say “and anger can also cause fear sometimes too,” does that mean he’s ruling that out?
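For reference, here is what the tracing rule gives for that full mediation chain. The lettering is my assumption (a for FEAR to ANGER, b for ANGER to HATE, c for HATE to SUFFERING), and it assumes standardized variables with no other paths or correlated residuals. In a chain there is only one route between any two variables, so each implied correlation is just the product of the paths along that route:

```latex
\begin{array}{l|cccc}
                 & \text{FEAR} & \text{ANGER} & \text{HATE} & \text{SUFFERING} \\ \hline
\text{FEAR}      & 1           &              &             &                  \\
\text{ANGER}     & a           & 1            &             &                  \\
\text{HATE}      & ab          & b            & 1           &                  \\
\text{SUFFERING} & abc         & bc           & c           & 1
\end{array}
```

A student who reads Yoda differently (say, adding a direct path from fear to hate) would get extra terms in some of these cells.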

Relatedly, observational data will only give you unbiased causal estimates (of the effect of fear on anger, for example) if Yoda’s specification of the true causal structure really is complete and correct, or if you fill in the gaps yourself and include enough constraints to identify the model. Either way, that is an assumption you bring to the data. How much do you trust Yoda’s model? Questions 4 and 5 are supposed to help students think about ways in which the model could and could not be falsified.
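To make the falsification point concrete, here is a rough sketch in Python. It assumes the simple chain FEAR -> ANGER -> HATE -> SUFFERING, and it uses the three adjacent observed correlations from question 5 as plug-in path estimates rather than a proper fit from SEM software. The function and variable names are made up for illustration. It compares the model-implied correlations to the observed ones, then checks that the reversed chain (SUFFERING -> HATE -> ANGER -> FEAR) implies exactly the same matrix.

```python
from math import prod

VARIABLES = ["FEAR", "ANGER", "HATE", "SUFFERING"]

# Observed correlations from question 5, written out as a full symmetric matrix.
OBSERVED = [
    [1.0, 0.5, 0.3, 0.4],
    [0.5, 1.0, 0.6, 0.3],
    [0.3, 0.6, 1.0, 0.5],
    [0.4, 0.3, 0.5, 1.0],
]


def implied_corr_chain(paths):
    """Implied correlations for a chain X1 -> X2 -> ... -> Xk.

    paths[i] is the standardized path from variable i to variable i + 1.
    By the tracing rule, the correlation between two variables in a chain
    is the product of the paths between them (empty product = 1 on the diagonal).
    """
    k = len(paths) + 1
    return [[prod(paths[min(i, j):max(i, j)]) for j in range(k)] for i in range(k)]


# Plug-in estimates (an assumption of this sketch): a, b, c are the
# correlations between adjacent variables in the chain.
paths = [OBSERVED[i][i + 1] for i in range(len(VARIABLES) - 1)]
implied = implied_corr_chain(paths)

# Residuals: observed minus implied. Large values flag misfit.
residuals = [
    [OBSERVED[i][j] - implied[i][j] for j in range(4)] for i in range(4)
]

# Reversed chain: variables in order SUFFERING, HATE, ANGER, FEAR with
# paths c, b, a. Flip rows and columns to get back to the original order.
reversed_implied = implied_corr_chain(paths[::-1])
reversed_reordered = [row[::-1] for row in reversed_implied[::-1]]

if __name__ == "__main__":
    for i in range(4):
        for j in range(i):
            print(
                f"{VARIABLES[i]:>9} ~ {VARIABLES[j]:<6} "
                f"observed={OBSERVED[i][j]:.2f} implied={implied[i][j]:.2f} "
                f"residual={residuals[i][j]:+.2f}"
            )
```

With these numbers the chain reproduces the fear-hate and anger-suffering correlations (both .30), but it implies a fear-suffering correlation of .15 where the data show .40. So the data can embarrass the model. One candidate revision is a direct path from fear to suffering, though a common cause would account for the gap too. The data cannot tell Yoda’s chain apart from its mirror image, because both imply the same matrix.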

In a comment on an earlier post, I repeated an observation I once heard someone make, that psychologists tend to model all relationships as zero unless given reason to think otherwise, whereas econometricians tend to model all relationships as free parameters unless given reason to think otherwise. I’m not sure why that is the case (maybe a legacy of null hypothesis significance testing in experimental psychology, where you’re supposed to start by hypothesizing a zero relationship and then look for reasons to reject that hypothesis). At any rate, if you think like an econometrician and come from the “no true zeroes” school of thought, you’ll need something more than just observational data on 4 variables in order to test this model. That makes the Jedi Theory a tough nut to crack. Experimental manipulation gets ethically more dubious as you proceed down the proposed causal chain. And I’m not sure how easy it would be to come up with good instruments for all of these variables.

I also briefly worried that I might be sucking the enjoyment out of the movie. But then I remembered that the quote is from The Phantom Menace, so that’s already been done.

Prepping for SEM

I’m teaching the first section of a structural equation modeling class tomorrow morning. This is the third time I’m teaching the course, and I find that the more times I teach it, the less traditional SEM I actually cover. I’m dedicating quite a bit of the first week to discussing principles of causal inference, spending the second week re-introducing regression as a modeling framework (rather than a toolbox statistical test), and returning to causal inference later when we talk about path analysis and mediation (including assigning a formidable critique by John Bullock et al. coming out soon in JPSP).

The reason I’m moving in that direction is that I’ve found that a lot of students want to rush into questionable uses of SEM without understanding what they’re getting into. I’m probably guilty of having done that, and I’ll probably do it again someday, but I’d like to think I’m learning to be more cautious about the kinds of inferences I’m willing to make. To people who don’t know better, SEM often seems like magical fairy dust that you can sprinkle on cross-sectional observational data to turn it into something causally conclusive. I’ve probably been pretty far on the permissive end of the spectrum that Andrew Gelman talks about, in part because I think experimental social psychology sometimes overemphasizes internal validity to the exclusion of external validity (and I’m not talking about the special situations that Mook gets over-cited for). But I want to instill an appropriate level of caution.

BTW, I just came across this quote from Donald Campbell and William Shadish: “When it comes to causal inference from quasi-experiments, design rules, not statistics.” I’d considered writing “IT’S THE DESIGN, STUPID” on the board tomorrow morning, but they probably said it nicer.