Will this time be different?

I had the honor of delivering the closing address to the Society for the Improvement of Psychological Science on July 9, 2019 in Rotterdam. The following are my prepared remarks. (These remarks are also archived on PsyArXiv.)

Some years ago, not long after people in psychology began talking in earnest about a replication crisis and what to do about it, I was talking with a colleague who has been around the field for longer than I have. He said to me, “Oh, this is all just cyclical. Psychology goes through a bout of self-flagellation every decade or two. It’ll be over soon and nothing will be different.”

I can’t say I blame him. Psychology has had other periods of reform that have fizzled out. One of the more recent ones was the statistical reform effort of the late 20th century – and you should read Fiona Fidler’s history of it, because it is completely fascinating. Luminaries such as Jacob Cohen, Paul Meehl, and Robert Rosenthal were among the members and advisors of a blue-ribbon APA task force to change the practice of statistics in psychology. This resulted in the APA style manual adding a requirement to report effect sizes – one which is occasionally even followed, though the accompanying call to interpret effect sizes has gotten much less traction – and a few other modest improvements. But it was nothing like the sea change that many of them believed was needed.

But flash forward to the current decade. When people ask themselves, “Will this time be different?” it is fair to say there is a widespread feeling that indeed it could be. There is no single reason. Instead, as Bobbie Spellman and others have written, it is a confluence of contributing factors.

One of them is technology. The flow of scientific information is no longer limited by what we can print on sheets of pulped-up tree guts bound together into heavy volumes and sent by truck, ship, and airplane around the world. Networked computing and storage means that we can share data, materials, code, preprints, and more, at a quantity and speed that was barely imagined even a couple of decades ago when I was starting graduate school. Technology has given scientists far better ways to understand and verify the work we are building on, collaborate, and share what we have discovered.

A second difference is that more people now view the problem not just as an analytic one – the domain of logicians and statisticians – but also, complementarily, as a human one. So, for example, the statistical understanding of p-values as a function of a model and data has been married to a social-scientific understanding: p-values are also a function of the incentives and institutions that the people calculating them are working under. We see meta-scientists collecting data and developing new theories of how scientific knowledge is produced. More people see the goal of studying scientific practice not just as diagnosis – identify a problem, write a paper about it, and go to your grave knowing you were right about something – but also as designing effective interventions and embracing the challenges of scaling up to implementation.

A third difference, and perhaps the most profound, is where the ideas and energy are coming from. A popular debate on Twitter is what to call this moment in our field’s history. Is it a crisis? A renaissance? A revolution? One term that gets used a lot is “open science movement.” Once, when this came up on Twitter, I asked a friend who’s a sociologist what he thought. He stared at me for a good three seconds, like I’d just grown a second head, and said: “OF COURSE it’s a social movement.” (It turns out that people debating “are we a social movement?” is classic social movement behavior.) I think that idea has an underappreciated depth to it. Because maybe the biggest difference is that what we are seeing now is truly a grassroots social movement.

What does it mean to take seriously the idea that open science is a social movement? Unlike blue-ribbon task forces, social movements do not usually have a single agenda or a formal charge. They certainly aren’t made up of elites handpicked by august institutions. Instead, movements are coalitions – of individuals, groups, communities, and organizations that have aligned, but often not identical, values, priorities, and ideas.

We see that pluralism in the open science movement. To take just one example, many in psychology see a close connection between openness and rigor. We trace problems with replicability and cumulative scientific progress back, in part, to problems with transparency. When we cannot see details of the methods used to produce important findings, when we cannot see what the data actually look like, when we cannot verify when in the research process key decisions were made, then we cannot properly evaluate claims and evidence. But another very different argument for openness is about access and justice: expanding who gets to see and benefit from the work of scientists, join the discourse around it, and do scientific work. Issues of access would be important no matter how replicable and internally rigorous our science was. Of course, many – and I count myself among them – embrace both of these as animating concerns, even if we came to them from different starting points. That’s one of the powerful things that can happen when movements bring together people with different ideas and different experiences. But as the movement grows and matures, the differences will increase too. Different concerns and priorities will not always be so easily aligned. We need to be ready for that.

SIPS is not the open science movement – the movement is much bigger than we are. Nobody has to be a part of SIPS to do open science or be part of the movement. We should never make the mistake of believing that a SIPS membership defines open science, as my predecessor Katie Corker told us so eloquently last year. But we have the potential to be a powerful force for good within the movement. When SIPS had its first meeting just three years ago, it felt like a small, ragtag band of outsiders who had just discovered they weren’t alone. Now look at us. We have grown in size so fast that our conference organizers could barely keep up. 525 people flew from around the world to get together and do service. Signing up for service! (Don’t tell your department chair.) People are doing it because they believe in our mission and want to do something about it.

This brings me to what I see as the biggest challenge that lies ahead for SIPS. As we have grown and will continue to grow, we need to be asking: What do we do about differences? Both the differences that already exist in our organization, and the differences that could be represented here but aren’t yet. Differences in backgrounds and identities, differences in culture and geography, differences in subfields and interests and approaches. To which my answer is: Differences can be our strength. But that won’t happen automatically. It will take deliberation, intent, and work to make them an asset.

What does that mean? Within the open science movement, many have been working on improvements. But there is a natural tendency for people to craft narrow solutions that just work for themselves, and for people and situations they know. SIPS is at its best when it breaks through that, when it brings together people with different knowledge and concerns to work together. When a discussion about getting larger and more diverse samples includes people who come from different kinds of institutions who have access to different resources, different organizational and technical skills, but see common cause, we get the Psychological Science Accelerator. When people who work with secondary data are in the room talking about preregistration, then instead of another template for a simple two-by-two, we get an AMPPS paper about preregistration for existing data. When mixed-methods researchers feel welcomed one year, they come back the next year with friends and organize a whole session on open qualitative research.

Moving forward, for SIPS to continue to be a force for good, we have to take the same expectations we have of our science and apply them to our movement, our organization, and ourselves. We have to listen to criticism from both within and outside of the society and ask what we can learn from it. Each one of us has to take diversity and inclusion as our own responsibility and ask ourselves, how can I make this not some nice add-on, but integral to the way I am trying to improve psychology? We have to view self-correction and improvement – including improvement in how diverse and inclusive we are – as an ongoing task, not a project we will finish and move on from.

I say this not just as some nice paean to diversity, but because it is an existential task for SIPS and the open science movement. This is core to our values. If we remake psychological science into something that works smashingly well for the people in this room, but not for anyone else, we will have failed at our mission. The history of collective human endeavors, including social movements – the ways they can reproduce sexism and racism and other forms of inequality, and succumb to power and prestige and faction – gives us every reason to be on our guard. But the energy, passion, and ideals I’ve seen expressed these last few days by the people in this room give me cause for hope. We are, at the end of the day, a service organization. Hundreds of people turned up in Rotterdam to try to make psychology better not just for themselves, but for the world.

So when people ask, “Will this time be different?” my answer is this: Don’t ever feel certain that the answer is yes, and maybe this time it will be.

Improving Psychological Science at SIPS

Last week was the second meeting of the Society for the Improvement of Psychological Science, a.k.a. SIPS[1]. SIPS is a service organization with the mission of advancing and supporting all of psychological science. About 200 people met in Charlottesville, VA to participate in hackathons and lightning talks and unconference sessions, go to workshops, and meet other people interested in working to improve psychology.

What Is This Thing Called SIPS?

If you missed SIPS and are wondering what happened – or even if you were there but want to know more about the things you missed – here are a few resources I have found helpful:

The conference program gives you an overview and the conference OSF page has links to most of what went on, though it’s admittedly a lot to dig through. For an easier starting point, Richie Lennie posted an email he wrote to his department with highlights and links, written specifically with non-attendees in mind.

If you drill down one level from the conference OSF page, you will find that all of the workshop presenters put their materials online. I didn’t make it to any workshops so I appreciate having access to those resources. One good example is Simine Vazire and Bobbie Spellman’s workshop on writing transparent and reproducible articles. Their slides show excerpts from published papers on things like how to transparently report exploratory analyses, how to report messy results, how to interpret a null result, and more. For me, writing is a lot easier when I have examples and models to work from, and I expect that I will be referring to those in the future.

The list of hackathon OSF pages is worth browsing. Hackathons are collaborative sessions for people interested in working on a defined project. Organizers varied in how much they used their OSF pages – some used them mainly for internal organization, while others hosted finished or near-finished products on them. A standout example of the latter is the graduate research methods course hackathon. Its OSF wiki has a list of 31 topics, almost all of which are live links to pages with learning goals, reading lists, demonstrations, and assignments. If you teach grad research methods, or anything else with methodsy content, go raid the site for all sorts of useful materials.

The program also had space for smaller or less formal events. Unconferences were spontaneously organized sessions, some of which grew into bigger projects. Lightning talks were short presentations, often about work in progress.

As you browse through the resources, it is also worth keeping in the back of your mind that many projects get started at SIPS but not finished there, so look for more projects to come to fruition in the weeks and months ahead.

A challenge for future SIPS meetings is going to be figuring out how to reach beyond the people physically attending the meeting and get the broadest possible engagement, as well as how to support dissemination of projects and initiatives that people create at SIPS. We have already gotten some valuable feedback about how other hackathons and unconferences manage that. This year’s meeting happened because of a Herculean effort by a very small group of volunteers[2] operating on a thin budget (at one point it was up in the air whether there’d even be wifi in the meeting space, if you can believe it) who had to plan an event that doubled in size from last year. As we grow we will always look for more and better ways to engage – the I in SIPS would not count for anything if the society did not apply it to itself.

My Personal Highlights

It is hard to summarize everything, but I will mention a few highlights from things I saw or participated in firsthand.

Neil Lewis Jr. and I co-organized a hackathon on diversity and inclusion in open science. We had so many people show up that we eventually split into five smaller groups working on different projects. My group worked on helping SIPS-the-organization start to collect member data so it can track how it is doing with respect to its diversity and inclusion goals. I posted a summary on the OSF page and would love to get feedback. (Neil is working on a guest post, so look for more here about that hackathon in the near future.)

Another session I participated in was the “diversity re-hack” on day two. The idea was that diversity and inclusion are relevant to everything, not just what comes up at a hackathon with “diversity and inclusion” in the title. So people who had worked on all the other hackathons on day one could come and workshop their in-progress projects to make them serve those goals even better. It was another well-attended session and we had representatives from nearly every hackathon group come to participate.

Katie Corker was the first recipient of the society’s first award, the SIPS Leadership Award. Katie has been instrumental in the creation of the society and in organizing the conference, and beyond SIPS she has also been a leader in open science in the academic community. Katie is a dynamo and deserves every bit of recognition she gets.

It was also exciting to see projects that originated at the 2016 SIPS meeting continuing to grow. During the meeting, APA announced that it will designate PsyArXiv as its preferred preprint server. And the creators of StudySwap, which also came out of SIPS 2016, just announced an upcoming Nexus (a fancy term for what we called “special issue” in the print days) with the journal Collabra: Psychology on crowdsourced research.

Speaking of which, Collabra: Psychology is now the official society journal of SIPS. It is fitting that SIPS partnered with an open-access journal, given the society’s mission. SIPS will oversee editorial responsibilities and the scientific mission of the journal, while the University of California Press will operate as the publisher.

But probably the most gratifying thing for me about SIPS was meeting early-career researchers who are excited about making psychological science more open and transparent, more rigorous and self-correcting, and more accessible and inclusive of everyone who wants to do science or could benefit from science. The challenges can sometimes feel huge, and I found it inspiring and energizing to spend time with people just starting out in the field who are dedicated to facing them.

*****

1. Or maybe it was the first meeting, since we ended last year’s meeting with a vote on whether to become a society, even though we were already calling ourselves that? I don’t know, bootstrapping is weird.

2. Not including me. I am on the SIPS Executive Committee so I got to see up close the absurd amount of work that went into making the conference. Credit for the actual heavy lifting goes to Katie Corker and Jack Arnal, the conference planning committee who made everything happen with the meeting space, hotel, meals, and all the other logistics; and the program committee of Brian Nosek, Michèle Nuijten, John Sakaluk, and Alexa Tullett, who were responsible for putting together the scientific (and, uh, I guess meta-scientific?) content of the conference.

Learning exactly the wrong lesson

For several years now I have heard fellow scientists worry that the dialogue around open and reproducible science could be used against science – to discredit results that people find inconvenient and even to de-fund science. And this has not just been fretting around the periphery. I have heard these concerns raised by scientists who hold policymaking positions in societies and journals.

A recent article by Ed Yong talks about this concern in the present political climate.

In this environment, many are concerned that attempts to improve science could be judo-flipped into ways of decrying or defunding it. “It’s been on our minds since the first week of November,” says Stuart Buck, Vice President of Research Integrity at the Laura and John Arnold Foundation, which funds attempts to improve reproducibility.

The worry is that policy-makers might ask why so much money should be poured into science if so many studies are weak or wrong? Or why should studies be allowed into the policy-making process if they’re inaccessible to public scrutiny? At a recent conference on reproducibility run by the National Academies of Sciences, clinical epidemiologist Hilda Bastian says that she and other speakers were told to consider these dangers when preparing their talks.

One possible conclusion is that we should slow down science’s movement toward greater openness and reproducibility. As Yong writes, “Everyone I spoke to felt that this is the wrong approach.” But as I said, those voices are out there, and many could take Yong’s article as reinforcing their position. So I think it bears elaborating why that would be the wrong approach.

Probably the least principled reason, but an entirely unavoidable practical one, is just that it would be impossible. The discussion cannot be contained. Notwithstanding some defenses of gatekeeping and critiques of science discourse on social media (where much of this discussion is happening), there is just no way to keep scientists from talking about these issues in the open.

And imagine for a moment that we nevertheless tried to contain the conversation. Would that be a good idea? Consider the “climategate” faux-scandal. Opponents of climate science cooked up an anti-transparency conspiracy out of a few emails that showed nothing of the sort. Now imagine if we actually did that – if we kept scientists from discussing science’s problems in the open. And imagine that getting out. That would be a PR disaster to dwarf any misinterpretation of open science (because the worst PR disasters are the ones based in reality).

But to me, the even more compelling consideration is that if we put science’s public image first, we are inverting our core values. The conversation around open and reproducible science cuts to fundamental questions about what science is – such as that scientific knowledge is verifiable, and that it belongs to everyone – and why science offers unique value to society. We should fully and fearlessly engage in those questions and in making our institutions and practices better. We can solve the PR problem after that. In the long run, the way to make the best possible case for science is to make science the best possible.

Rather than shying away from talking about openness and reproducibility, I believe it is more critical than ever that we all pull together to move science forward. Because if we don’t, others will make changes in our name that serve other agendas.

For example, Yong’s article describes a bill pending in Congress that would set impossibly high standards of evidence for the Environmental Protection Agency to base policy on. Those standards are wrapped in the rhetoric of open science. But as Michael Eisen says in the article, “It won’t produce regulations based on more open science. It’ll just produce fewer regulations.” This is almost certainly the intended effect.

As long as scientists – individually and collectively in our societies and journals – drag our heels on making needed reforms, there will be a vacuum that others will try to fill. Turn that around, and the better the scientific community does its job of addressing openness and transparency in the service of actually making science do what science is supposed to do – making it more open, more verifiable, more accessible to everyone – the better positioned we will be to rebut those kinds of efforts by saying, “Nope, we got this.”

Replicability in personality psychology, and the symbiosis between cumulative science and reproducible science

There is apparently an idea going around that personality psychologists are sitting on the sidelines having a moment of schadenfreude during the whole social psychology Replicability Crisis thing.

Not true.

The Association for Research in Personality conference just wrapped up in St. Louis. It was a great conference, with lots of terrific research. (Highlight: watching three of my students give kickass presentations.) And the ongoing scientific discussion about openness and reproducibility had a definite, noticeable effect on the program.

The most obvious influence was the (packed) opening session on reproducibility. First, Rich Lucas talked about the effects of JRP’s recent policy of requiring authors to explicitly talk about power and sample size decisions. The policy has had a noticeable impact on sample sizes of published papers, without major side effects like tilting toward college samples or cheap self-report measures.

Second, Simine Vazire talked about the particular challenges of addressing openness and replicability in personality psychology. A lot of the discussion in psychology has been driven by experimental psychologists, and Simine talked about what the general issues that cut across all of science look like when applied in particular to personality psychology. One cool recommendation she had (not just for personality psychologists) was to imagine that you had to include a “Most Damning Result” section in your paper, where you had to report the one result that looked worst for your hypothesis. How would that change your thinking?*

Third, David Condon talked about particular issues for early-career researchers, though really it was for anyone who wants to keep learning – he had a charming story of how he was inspired by seeing one of his big-name intellectual heroes give a major award address at a conference, then show up the next morning for an “Introduction to R” workshop. He talked a lot about tools and technology that we can use to help us do more open, reproducible science.

And finally, Dan Mroczek talked about research he has been doing with a large consortium to try to do reproducible research with existing longitudinal datasets. They have been using an integrated data analysis framework as a way of combining longitudinal datasets to test novel questions, and to look at issues like generalizability and reproducibility across existing data. Dan’s talk was a particularly good example of why we need broad participation in the replicability conversation. We all care about the same broad issues, but the particular solutions that experimental social psychologists identify aren’t going to work for everybody.

In addition to their obvious presence in the plenary session, reproducibility and openness seemed to suffuse the conference. As Rick Robins pointed out to me, there seemed to be a lot more people presenting null findings in an open, frank way. And talk of which findings had replicated and which hadn’t, of people tempering conclusions from initial data, and so on was common and well received, as if it were a normal part of science. Imagine that.

One thing that stood out to me in particular was the relationship between reproducible science and cumulative science. Usually I think of the first as helping the second: you need robust, reproducible findings as a foundation before you can either dig deeper into process or expand out in various ways. But in many ways, the conference reminded me that the reverse is true as well: cumulative science helps reproducibility.

When people are working on the same or related problems, using the same or related constructs and measures, it becomes much easier to do robust, reproducible science. In many ways, structural models like the Big Five have helped personality psychology with that. For example, the integrated data analysis that Dan talked about requires you to have measures of the same constructs in every dataset. The Big Five provide a common coordinate system to map different trait measures onto, even if they weren’t originally conceptualized that way. Psychology needs more models like that in other domains – common coordinate systems of constructs and measures that help make sense of how different research programs fit together.
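To make the “common coordinate system” idea concrete, here is a minimal sketch (in Python, with made-up numbers and variable names) of how two studies that measured extraversion with different instruments could be harmonized onto a shared Big Five axis and pooled. It is only an illustration of the general logic, not the integrated data analysis framework Dan described, which handles this far more carefully (for example, with measurement models rather than simple within-study z-scores).

```python
# Illustrative only: two hypothetical studies measured extraversion with
# different instruments (say, a BFI composite vs. an IPIP sum score).
# Because both can be scored onto the same Big Five domain, they can be
# standardized within study and pooled. All values and names are made up.
import pandas as pd

study_a = pd.DataFrame({
    "bfi_extraversion": [3.1, 4.0, 2.5, 3.8],      # 1-5 Likert composite
    "life_satisfaction": [4.2, 5.0, 3.1, 4.6],
})
study_b = pd.DataFrame({
    "ipip_extraversion": [22, 35, 28, 41, 30],      # sum of 10 items, 10-50
    "life_satisfaction": [3.9, 4.8, 4.1, 5.3, 4.4],
})

def harmonize(df, trait_col, study_label):
    """Map a study-specific trait score onto the shared Big Five axis
    by z-scoring within study, and tag rows with their study of origin."""
    return pd.DataFrame({
        "extraversion_z": (df[trait_col] - df[trait_col].mean()) / df[trait_col].std(ddof=0),
        "life_satisfaction": df["life_satisfaction"],
        "study": study_label,
    })

# Pool the harmonized datasets; the 'study' column lets a later model
# ask about between-study differences (the generalizability question).
pooled = pd.concat([
    harmonize(study_a, "bfi_extraversion", "A"),
    harmonize(study_b, "ipip_extraversion", "B"),
], ignore_index=True)

print(pooled.groupby("study")["extraversion_z"].describe())
```

The point of the sketch is just that the harmonization step only works because both studies can be scored onto the same construct; without a shared coordinate system like the Big Five, there would be nothing to pool on.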

And Simine talked about (and has blogged about) the idea that we should collect fewer but better datasets, with more power and better but more labor-intensive methods. If we are open with our data, each of us can do one thing really well and then combine with or look across other people’s datasets to take advantage of what they do really well – but only if we are all working on the same things, so that there is enough useful commonality across all those open datasets.

That means we need to move away from a career model of science where every researcher is supposed to have an effect, construct, or theory that is their own little domain that they’re king or queen of. Personality psychology used to be that way, but the Big Five has been a major counter to that, at least in the domain of traits. That kind of convergence isn’t problem-free — the model needs to evolve (Big Six, anyone?), which means that people need the freedom to work outside of it; and it can’t try to subsume things that are outside of its zone of relevance. Some people certainly won’t love it – there’s a certain satisfaction to being the World’s Leading Expert on X, even if X is some construct or process that only you and maybe your former students are studying. But that’s where other fields have gone, even going as far as expanding beyond the single-investigator lab model: Big Science is the norm in many parts of physics, genomics, and other fields. With the kinds of problems we are trying to solve in psychology – not just our reproducibility problems, but our substantive scientific ones — that may increasingly be a model for us as well.


———-

* Actually, I don’t think she was only imagining. Simine is the incoming editor at SPPS.** Give it a try, I bet she’ll desk-accept the first paper that does it, just on principle.

** And the main reason I now have footnotes in most of my blog posts.