An interesting study of why unstructured interviews are so alluring

A while back I wrote about whether grad school admissions interviews are effective. Following up on that, Sam Gosling recently passed along an article by Dana, Dawes, and Peterson from the latest issue of Judgment and Decision Making:

Belief in the unstructured interview: The persistence of an illusion

Unstructured interviews are a ubiquitous tool for making screening decisions despite a vast literature suggesting that they have little validity. We sought to establish reasons why people might persist in the illusion that unstructured interviews are valid and what features about them actually lead to poor predictive accuracy. In three studies, we investigated the propensity for “sensemaking” – the ability for interviewers to make sense of virtually anything the interviewee says – and “dilution” – the tendency for available but non-diagnostic information to weaken the predictive value of quality information. In Study 1, participants predicted two fellow students’ semester GPAs from valid background information like prior GPA and, for one of them, an unstructured interview. In one condition, the interview was essentially nonsense in that the interviewee was actually answering questions using a random response system. Consistent with sensemaking, participants formed interview impressions just as confidently after getting random responses as they did after real responses. Consistent with dilution, interviews actually led participants to make worse predictions. Study 2 showed that watching a random interview, rather than personally conducting it, did little to mitigate sensemaking. Study 3 showed that participants believe unstructured interviews will help accuracy, so much so that they would rather have random interviews than no interview. People form confident impressions even when interviews are defined to be invalid, like our random interview, and these impressions can interfere with the use of valid information. Our simple recommendation for those making screening decisions is not to use them.

It’s an interesting study. In my experience people’s beliefs in unstructured interviews are pretty powerful — hard to shake even when you show them empirical evidence.

I did have some comments on the design and analyses:

1. In Studies 1 and 2, each subject made a prediction about absolute GPA for one interviewee. So estimates of how good people are at predicting GPA from interviews are based entirely on between-subjects comparisons. It is very likely that a substantial chunk of the variance in predictions will be due to perceiver variance — differences between subjects in their implicit assumptions about how GPA is distributed. (E.g., Subject 1 might assume most GPAs range from 3 to 4, whereas Subject 2 assumes most GPAs range from 2.3 to 3.3. So even if they have the same subjective impression of the same target — “this person’s going to do great this term” — their numerical predictions might differ by a lot.) That perceiver variance would go into the denominator as noise variance in this study, lowering the interviewers’ predictive validity correlations.

Whether that’s a good thing or a bad thing depends on what situation you’re trying to generalize to. Perceiver variance would contribute to errors in judgment when each judge makes an absolute decision about a single target. On the other hand, in some cases perceivers make relative judgments about several targets, such as when an employer interviews several candidates and picks the best one. In that setting, perceiver variance would not matter, and a study with this design could underestimate accuracy.
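To make the perceiver-variance point concrete, here is a minimal simulation sketch. The numbers and variable names are my own invention, not the paper’s data; the point is only that a judge-specific offset in the GPA scale ends up as noise in a between-subjects validity correlation:

```r
# Toy simulation: each interviewer predicts GPA for one target, but interviewers
# differ in their implicit GPA scale (perceiver variance). All values are made up.
set.seed(1)
n_targets  <- 7
true_gpa   <- rnorm(n_targets, mean = 3.2, sd = 0.4)

n_judges   <- 75
target     <- sample(n_targets, n_judges, replace = TRUE)
impression <- true_gpa[target] + rnorm(n_judges, sd = 0.2)  # shared impression of each target
judge_bias <- rnorm(n_judges, sd = 0.4)                     # judges' differing implicit scales
prediction <- impression + judge_bias

# Between-subjects validity: judge_bias goes into the noise
cor(prediction, true_gpa[target])

# Validity with the judge-specific scale removed, as in a within-judge ranking design
cor(impression, true_gpa[target])
```

The second correlation is roughly what a design based on relative judgments would recover, because a constant offset within a judge cannot change which of that judge’s targets looks best.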

2. Study 1 had 76 interviewers spread across 3 conditions (n = 25 or 26 per condition), and only 7 interviewees (each of whom was rated by multiple interviewers). Based on the 73 degrees of freedom reported for the test of the “dilution” effect, it looks like they treated interviewer as the unit of analysis but did not account for the non-independence of ratings of the same interviewee. Study 2 appeared to have similar issues (though in Study 2 the dilution effect was not significant).
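As a hedged sketch of what accounting for that dependency could look like, a multilevel model with a random intercept for interviewee would let the standard errors reflect the handful of targets rather than the 70-odd interviewers. The data and variable names below are simulated and hypothetical, not the paper’s:

```r
# Sketch with simulated data: many interviewers rate the same few interviewees.
# Variable names (abs_error, condition, interviewee) are hypothetical.
library(lme4)

set.seed(2)
d <- data.frame(
  interviewee = factor(rep(1:7, length.out = 75)),
  condition   = factor(sample(c("background_only", "interview"), 75, replace = TRUE))
)
d$abs_error <- 0.30 + 0.10 * (d$condition == "interview") +
  rnorm(7, sd = 0.15)[d$interviewee] + rnorm(75, sd = 0.20)

# The random intercept for interviewee acknowledges that ratings of the same
# target are not independent observations.
fit <- lmer(abs_error ~ condition + (1 | interviewee), data = d)
summary(fit)
```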

3. I also had concerns about power and precision of the estimates. Any inferences about who makes better or worse predictions will depend a lot on variance among the 7 interviewees whose GPAs were being predicted (8 in Study 2). I haven’t done a formal power analysis, but my intuition is that with so few interviewees, power and precision are likely to be quite low. You can see a possible sign of this in one key difference between the studies. In Study 1, the correlation between the interviewees’ prior GPA and upcoming GPA was r = .65, but in Study 2 it was r = .37. That’s a pretty big difference between estimates of a quantity that should not be changing between studies.
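To put some numbers on the precision issue: approximate 95% confidence intervals for those two correlations (my back-of-the-envelope calculation via the Fisher z transformation, not anything reported in the paper) are enormous and overlap almost completely, so a swing from .65 to .37 is about what you’d expect with samples of 7 and 8:

```r
# Approximate 95% CI for a correlation via the Fisher z transformation
ci_r <- function(r, n) {
  z  <- atanh(r)
  se <- 1 / sqrt(n - 3)
  tanh(z + c(-1, 1) * qnorm(0.975) * se)
}

ci_r(0.65, 7)   # roughly -0.20 to 0.94
ci_r(0.37, 8)   # roughly -0.45 to 0.85
```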

So it’s an interesting study but not one that can give answers I’d call definitive. If that’s well understood by readers of the study, I’m okay with that. Maybe someone will use the interesting ideas in this paper as a springboard for a larger followup. Given the ubiquity of unstructured interviews, it’s something we need to know more about.

6 thoughts on “An interesting study of why unstructured interviews are so alluring”

  1. Fascinating paper, but troubling in that, from what I could tell, all three of the component studies were conducted using students, mainly undergraduates. On the one hand, this at least levels the playing field; but on the other, there are potentially nontrivial problems with unaccounted-for heterogeneity: some students will likely be far more adept at discernment than others. Even more troubling is extrapolating from studies using UNTRAINED undergraduate interviewers to similar implications for professionals with, in most cases, years of experience. Those studies, of course, are far harder to carry out, so researchers tend not to. Would we also extrapolate these to college admissions directors? That would be pretty extreme.

    There are sub-areas of psychology, like basic neuro studies, where using undergrads is a reasonable proxy for humanity in general. Social psychology is somewhat more of a stretch. And this sort of study (organizational psych? not sure how to classify it, other than perhaps some form of BDT) is more far-fetched still. Intriguing results, but it would be like extrapolating Zimbardo’s studies to what would happen with trained prison guards.

      1. Thanks! The R code is here (http://journal.sjdm.org/12/121130a/interviewanalyses.R), and it indicates they did use a multilevel model, using the lmer function in the lme4 package (http://cran.r-project.org/web/packages/lme4/index.html). The paper is oddly silent on this aspect of the methodology.

        Regardless, this cannot get around the use of untrained undergrads as subjects. The study should really only extrapolate to people “like” those in the study, which would, I should think, preclude trained professionals with years of empirical data and experience.

  2. Melinda Blackman (full disclosure: former student of mine) published a paper comparing the validity of structured and unstructured interviews against self and peer ratings of personality. The hypothesis was derived from the idea that an unstructured interview is a “weak” situation and therefore should yield a wider range of behaviors more diagnostic of personality. The study used undergraduates, but a relevant job description was carefully constructed, and the limitations and questions for further research are frankly described. Here’s the reference:
    Melinda C. Blackman (2002). Personality Judgment and the Utility of the Unstructured Employment Interview. Basic and Applied Social Psychology, 24(3), 241-250. http://dx.doi.org/10.1207/S15324834BASP2403_6
