Will this time be different?

I had the honor to deliver the closing address to the Society for the Improvement of Psychological Science on July 9, 2019 in Rotterdam. The following are my prepared remarks. (These remarks are also archived on PsyArXiv.)

Some years ago, not long after people in psychology began talking in earnest about a replication crisis and what to do about it, I was talking with a colleague who has been around the field for longer than I have. He said to me, “Oh, this is all just cyclical. Psychology goes through a bout of self-flagellation every decade or two. It’ll be over soon and nothing will be different.”

I can’t say I blame him. Psychology has had other periods of reform that have fizzled out. One of the more recent ones was the statistical reform effort of the late 20th century – and you should read Fiona Fidler’s history of it, because it is completely fascinating. Luminaries like Jacob Cohen, Paul Meehl, Robert Rosenthal, and others were members and advisors of a blue-ribbon APA task force to change the practice of statistics in psychology. This resulted in the APA style manual adding a requirement to report effect sizes – one which is occasionally even followed, though the accompanying call to interpret effect sizes has gotten much less traction – and a few other modest improvements. But it was nothing like the sea change that many of them believed was needed.

But flash forward to the current decade. When people ask themselves, “Will this time be different?” it is fair to say there is a widespread feeling that indeed it could be. There is no single reason. Instead, as Bobbie Spellman and others have written, it is a confluence of contributing factors.

One of them is technology. The flow of scientific information is no longer limited by what we can print on sheets of pulped-up tree guts bound together into heavy volumes and sent by truck, ship, and airplane around the world. Networked computing and storage means that we can share data, materials, code, preprints, and more, at a quantity and speed that was barely imagined even a couple of decades ago when I was starting graduate school. Technology has given scientists far better ways to understand and verify the work we are building on, collaborate, and share what we have discovered.

A second difference is that more people now view the problem not just as an analytic one – the domain of logicians and statisticians – but also, complementarily, as a human one. So, for example, the statistical understanding of p-values as a function of a model and data has been married to a social-scientific understanding: p-values are also a function of the incentives and institutions that the people calculating them are working under. We see meta-scientists collecting data and developing new theories of how scientific knowledge is produced. More people see the goal of studying scientific practice not just as diagnosis – identify a problem, write a paper about it, and go to your grave knowing you were right about something – but also as designing effective interventions and embracing the challenges of scaling up to implementation.

A third difference, and perhaps the most profound, is where the ideas and energy are coming from. A popular debate on Twitter is what to call this moment in our field’s history. Is it a crisis? A renaissance? A revolution? One term that gets used a lot is “open science movement.” Once, when this came up on Twitter, I asked a friend who’s a sociologist what he thought. He stared at me for a good three seconds, like I’d just grown a second head, and said: “OF COURSE it’s a social movement.” (It turns out that people debating “are we a social movement?” is classic social movement behavior.) I think that idea has an underappreciated depth to it. Because maybe the biggest difference is that what we are seeing now is truly a grassroots social movement.

What does it mean to take seriously the idea that open science is a social movement? Unlike blue-ribbon task forces, social movements do not usually have a single agenda or a formal charge. They certainly aren’t made up of elites handpicked by august institutions. Instead, movements are coalitions – of individuals, groups, communities, and organizations that have aligned, but often not identical, values, priorities, and ideas.

We see that pluralism in the open science movement. To take just one example, many in psychology see a close connection between openness and rigor. We trace problems with replicability and cumulative scientific progress back, in part, to problems with transparency. When we cannot see details of the methods used to produce important findings, when we cannot see what the data actually look like, when we cannot verify when in the research process key decisions were made, then we cannot properly evaluate claims and evidence. But another very different argument for openness is about access and justice: expanding who gets to see and benefit from the work of scientists, join the discourse around it, and do scientific work. Issues of access would be important no matter how replicable and internally rigorous our science was. Of course, many – and I count myself among them – embrace both of these as animating concerns, even if we came to them from different starting points. That’s one of the powerful things that can happen when movements bring together people with different ideas and different experiences. But as the movement grows and matures, the differences will increase too. Different concerns and priorities will not always be so easily aligned. We need to be ready for that.

SIPS is not the open science movement – the movement is much bigger than we are. Nobody has to be a part of SIPS to do open science or be part of the movement. We should never make the mistake of believing that a SIPS membership defines open science, as my predecessor Katie Corker told us so eloquently last year. But we have the potential to be a powerful force for good within the movement. When SIPS had its first meeting just three years ago, we felt like a small, ragtag band of outsiders who had just discovered we weren’t alone. Now look at us. We have grown so fast that our conference organizers could barely keep up. 525 people flew from around the world to get together and do service. Signing up for service! (Don’t tell your department chair.) People are doing it because they believe in our mission and want to do something about it.

This brings me to what I see as the biggest challenge that lies ahead for SIPS. As we have grown and will continue to grow, we need to be asking: What do we do about differences? Both the differences that already exist in our organization, and the differences that could be represented here but aren’t yet. Differences in backgrounds and identities, differences in culture and geography, differences in subfields and interests and approaches. To which my answer is: Differences can be our strength. But that won’t happen automatically. It will take deliberation, intent, and work to make them an asset.

What does that mean? Within the open science movement, many have been working on improvements. But there is a natural tendency for people to craft narrow solutions that just work for themselves, and for people and situations they know. SIPS is at its best when it breaks through that, when it brings together people with different knowledge and concerns to work together. When a discussion about getting larger and more diverse samples includes people from different kinds of institutions, with access to different resources and different organizational and technical skills, who see common cause, we get the Psychological Science Accelerator. When people who work with secondary data are in the room talking about preregistration, then instead of another template for a simple two-by-two, we get an AMPPS paper about preregistration for existing data. When mixed-methods researchers feel welcomed one year, they come back the next year with friends and organize a whole session on open qualitative research.

Moving forward, for SIPS to continue to be a force for good, we have to take the same expectations we have of our science and apply them to our movement, our organization, and ourselves. We have to listen to criticism from both within and outside of the society and ask what we can learn from it. Each one of us has to take diversity and inclusion as our own responsibility and ask ourselves, how can I make this not some nice add-on, but integral to the way I am trying to improve psychology? We have to view self-correction and improvement – including improvement in how diverse and inclusive we are – as an ongoing task, not a project we will finish and move on from.

I say this not just as some nice paean to diversity, but because it is an existential task for SIPS and the open science movement. This is core to our values. If we remake psychological science into something that works smashingly well for the people in this room, but not for anyone else, we will have failed at our mission. The history of collective human endeavors, including social movements – the ways they can reproduce sexism and racism and other forms of inequality, and succumb to power and prestige and faction – gives us every reason to be on our guard. But the energy, passion, and ideals I’ve seen expressed these last few days by the people in this room give me cause for hope. We are, at the end of the day, a service organization. Hundreds of people turned up in Rotterdam to try to make psychology better not just for themselves, but for the world.

So when people ask, “Will this time be different?” my answer is this: Don’t ever feel certain that the answer is yes, and maybe this time it will be.

Reflections on SIPS (guest post by Neil Lewis, Jr.)

The following is a guest post by Neil Lewis, Jr. Neil is an assistant professor at Cornell University.

Last week I visited the Center for Open Science in Charlottesville, Virginia to participate in the second annual meeting of the Society for the Improvement of Psychological Science (SIPS). It was my first time going to SIPS, and I didn’t really know what to expect. The structure was unlike that of any other conference I’ve been to—there was very little formal structure. There were a few talks and workshops here and there, but the vast majority of the time was devoted to “hackathons” and “unconference” sessions where people got together and worked on addressing pressing issues in the field: making journals more transparent, designing syllabi for research methods courses, forming a new journal, changing departmental/university culture to reward open science practices, making open science more diverse and inclusive, and much more. We were free to work on whatever issues we wanted and to set our own goals, timelines, and strategies for achieving those goals.

I spent most of the first two days at the diversity and inclusion hackathon that Sanjay and I co-organized. These sessions blew me away. Maybe we’re a little cynical, but going into the conference we thought maybe two or three people would stop by, and that it would essentially be the two of us trying to figure out what to do to make open science more diverse and inclusive. Instead, we had almost 40 people come and spend the first day identifying barriers to diversity and inclusion, and developing tools to address those barriers. We had sub-teams working on (1) improving measurement of diversity statistics (hard to know how much of a diversity problem one has if there’s poor measurement), (2) figuring out methods to assist those who study hard-to-reach populations, (3) articulating the benefits of open science and compiling getting-started resources for those who are new to it, (4) leveraging social media for mentorship on open science practices, and (5) developing materials to help PIs and institutions more broadly recruit and retain traditionally underrepresented students/scholars. Although we’re not finished, each team made substantial headway in its area.

On the second day, those teams continued working, but in addition we had a “re-hack” that allowed teams that were working on other topics (e.g., developing research methods syllabi, developing guidelines for reviewers, starting a new academic journal) to present their ideas and get feedback on how to make their projects/products more inclusive from the very beginning (rather than having diversity and inclusion be an afterthought as is often the case). Once again, it was inspiring to see how committed people were to making sure so many dimensions of our science become more inclusive.

These sessions, and so many others at the conference, gave me a lot of hope for the field—hope that I (and I suspect others) could really use (special shout-outs to Jessica Flake’s unconference on improving measurement, Daniel Lakens and Jeremy Biesanz’s workshop on sample size and effect size, and Liz Page-Gould and Alex Danvers’s workshop on Fundamentals of R for data analysis). It’s been a tough few years to be a scientist. I was working on my PhD in social psychology when the Open Science Collaboration published its report estimating the reproducibility of psychological science to be somewhere between one-third and one-half. Then a similar report came out about the state of cancer research – only twenty-five percent of papers replicated there. Now it seems like at least once a month there is some new failed replication, or some other study comes out with major methodological flaws. As someone just starting out, constantly seeing findings I learned were fundamental fail to replicate, and new work emerge so flawed, I often find myself wondering (a) what the hell do we actually know, and (b) if so many others can’t get it right, what chance do I have?

Many Big Challenges with No Easy Solutions

To try and minimize future fuck-ups in my own work, I started following a lot of methodologists on Twitter so that I could stay in the loop on what I need to do to get things right (or at least not horribly wrong). There are a lot of proposed solutions out there (and some argument about those solutions, e.g., p < .005) but there are some big ones that seem to have reached consensus, including vastly increasing the size of our samples to increase the reliability of findings. These solutions make sense for addressing the issues that got us to this point, but the more I’ve thought about and talked to others about them, the more it became clear that some may unintentionally create another problem along the way, which is to “crowd out” some research questions and researchers. For example, when talking with scholars who study hard-to-reach populations (e.g., racial and sexual minorities), a frequently voiced concern is that it is nearly impossible to recruit the sample sizes needed to meet new thresholds of evidence.

To provide an example from my own research, I went to graduate school intending to study racial-ethnic disparities in academic outcomes (particularly Black-White achievement gaps). In my first semester at the University of Michigan I asked my advisor to pay for a pre-screen of the Department of Psychology’s participant pool to see how many Black students I would have to work with if I pursued that line of research. There were 42 Black students in the pool that semester. Forty-two. Out of 1,157. If memory serves me well, that was actually one of the highest concentrations of Black students in the pool in my entire time there. Seeing that, I asked others who study racial minorities what they did. I learned that unless they had well-funded advisors who could afford to pay for their samples, many either shifted their research questions to topics that were more feasible to study, or they would spend their graduate careers collecting data for one or two studies. In my area, the latter approach was not a practical path to employment—professional development courses taught us that search committees expect multiple publications in the flagship journals, and those flagship journals usually require multiple studies for publication.

Learning about those dynamics, I temporarily shifted my research away from racial disparities until I figured out how to feasibly study those topics. In the interim, I studied other topics where I could recruit enough people to do the multi-study papers that were expected. That is not to say I am uninterested in those other topics I studied (I am very much interested in them), but disparities were what interested me most. Now, some may read that and think ‘Neil, that’s so careerist of you! You should have pursued the questions you were most passionate about, regardless of how long it took!’ And on an idealistic level, I agree with those people. But on a practical level—I have to keep a roof over my head and eat. There was no safety net at home if I was unable to get a job at the end of the program. So I played it safe for a few years before going back to the central questions that brought me to academia in the first place.

That was my solution. Others left altogether. As one friend depressingly put it—“there’s no more room for people like us; unless we get lucky with the big grants that are harder and harder to get, we can’t ask our questions—not when power analyses now say we need hundreds per cell; we’ve been priced out of the market.” And they’re not entirely wrong. Some collaborators and I recently ran a survey experiment with Black American participants; it was a 20-minute survey with 500 Black Americans. That one study cost us $11,000. Oh, and it’s a study for a paper that requires multiple studies. The only reason we can do this project is because we have a senior faculty collaborator who has an endowed chair and hence deep research pockets.
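
To make the arithmetic behind “hundreds per cell” concrete, here is a minimal sketch in Python (my own illustration, not an analysis from this post): a standard power calculation for a simple two-group comparison at 80% power and alpha = .05, with each participant priced at the roughly $22 implied by the $11,000, 500-person study above. The effect sizes are assumed values for illustration.

```python
# Illustrative sketch (assumed effect sizes), not an analysis from the post:
# how many participants per cell a two-group comparison needs at 80% power and
# alpha = .05, and what that costs at ~$22 per participant (from the
# $11,000 / 500-person study described above).
from statsmodels.stats.power import TTestIndPower

COST_PER_PARTICIPANT = 11_000 / 500  # ~$22, based on the study above
solver = TTestIndPower()

for d in (0.2, 0.3, 0.5):  # assumed effect sizes, for illustration only
    n_per_cell = solver.solve_power(effect_size=d, alpha=0.05, power=0.80,
                                    alternative="two-sided")
    total_cost = 2 * n_per_cell * COST_PER_PARTICIPANT  # two cells
    print(f"d = {d}: ~{n_per_cell:.0f} participants per cell, "
          f"~${total_cost:,.0f} for one two-cell study")
```

Under those assumptions, a small effect (d = 0.2) works out to roughly 400 participants per cell, or close to $17,000 for a single two-cell study at that price, before counting the additional studies a multi-study paper requires.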

So that is the state of affairs. The goalposts keep shifting, and it seems that those of us who already had difficulty asking our questions have to choose between pursuing the questions we’re interested in and pursuing questions that are practical for keeping roofs over our heads (e.g., questions that can be answered for $0.50 per participant on MTurk). And for a long time this has been discouraging because it felt as though those who have been leading the charge on research reform did not care. An example that reinforces this sentiment is a quote that floated around Twitter just last week. A researcher giving a talk at a conference said “if you’re running experiments with low sample n, you’re wasting your time. Not enough money? That’s not my problem.”

That researcher is not wrong. For all the reasons methodologists have been writing about for the past few years (and really, past few decades), issues like small sample sizes do compromise the integrity of our findings. At the same time, I can’t help but wonder about what we lose when the discussion stops there, at “that’s not my problem.” He’s right—it’s not his personal problem. But it is our collective problem, I think. What questions are we missing out on when we squeeze out those who do not have the thousands or millions of dollars it takes to study some of these topics? That’s a question that sometimes keeps me up at night, particularly the nights after conversations with colleagues who have incredibly important questions that they’ll never pursue because of the constraints I just described.

A Chance to Make Things Better

Part of what was so encouraging about SIPS was that we not only began discussing these issues, but people immediately took them seriously and started working on strategies to address them—putting together resources on “small-n designs” for those who can’t recruit the big samples, to name just one example. I have never seen issues of diversity and inclusion taken so seriously anywhere, and I’ve been involved in quite a few diversity and inclusion initiatives (given the short length of my career). At SIPS, people were working tirelessly to make actionable progress on these issues. And again, it wasn’t a fringe group of women and minority scholars doing this work, as is so often the case—we had one of the largest hackathons at the conference. I really wish more people had been there to witness it—it was amazing, and energizing. It was the best of science—a group of committed individuals working incredibly hard to understand and address some of the most difficult questions that are still unanswered, and producing practical solutions to pressing social issues.

Now it is worth noting that I had some skepticism going into the conference. When I first learned about it I went back-and-forth on whether I should go; and even the week before the conference, I debated canceling the trip. I debated canceling because there was yet another episode of the “purely hypothetical scenario” that Will Gervais described in his recent blog post:

A purely hypothetical scenario, never happens [weekly+++]

Some of the characters from that scenario were people I knew would be attending the conference. I was so disgusted watching it unfold that I had no desire to interact with them the following week at the conference. My thought as I watched the discourse was: if it is just going to be a conference of the angry men from Twitter, where people are patted on the back for their snark, using a structure from the tech industry (an industry not known for inclusion), then why bother attending? Apparently, I wasn’t alone in that thinking. At the diversity hackathon we discussed how several of us had invited colleagues who declined to come because, given their perceptions of who was going to be there and how those people often engage on social media, they did not feel it was worth their time.

I went despite my hesitation and am glad I did—it was the best conference I’ve ever attended. The attendees were not only warm and welcoming in real life, they also seemed to genuinely care about working together to improve our science, and to improve it in equitable and inclusive ways. They really wanted to hear what the issues are, and to work together to solve them.

If we regularly engage with each other (both online and face-to-face) in the ways that participants did at SIPS 2017, the sky is the limit for what we can accomplish together. The climate in that space for those few days provided the optimal conditions for scientific progress to occur. People were able to let their guards down, to acknowledge that what we’re trying to do is f*cking hard and that none of us know all the answers, to admit and embrace that we will probably mess up along the way, and that’s ok. As long as we know more and are doing better today than we knew and did yesterday, we’re doing ok – we just have to keep pushing forward.

That approach is something that I hope those who attended can take away, and figure out how to replicate in other contexts, across different mediums of communication (particularly online). I think it’s the best way to do, and to improve, our science.

I want to thank the organizers for all of the work they put into the conference. You have no idea how much being in that setting meant to me. I look forward to continuing to work together to improve our science, and hope others will join in this endeavor.

Improving Psychological Science at SIPS

Last week was the second meeting of the Society for the Improvement of Psychological Science, a.k.a. SIPS[1]. SIPS is a service organization with the mission of advancing and supporting all of psychological science. About 200 people met in Charlottesville, VA to participate in hackathons and lightning talks and unconference sessions, go to workshops, and meet other people interested in working to improve psychology.

What Is This Thing Called SIPS?

If you missed SIPS and are wondering what happened – or even if you were there but want to know more about the things you missed – here are a few resources I have found helpful:

The conference program gives you an overview and the conference OSF page has links to most of what went on, though it’s admittedly a lot to dig through. For an easier starting point, Richie Lennie posted an email he wrote to his department with highlights and links, written specifically with non-attendees in mind.

Drilling down one level from the conference OSF page, all of the workshop presenters put their materials online. I didn’t make it to any workshops so I appreciate having access to those resources. One good example is Simine Vazire and Bobbie Spellman’s workshop on writing transparent and reproducible articles. Their slideshow shows excerpts from published papers on things like how to transparently report exploratory analyses, how to report messy results, how to interpret a null result, and more. For me, writing is a lot easier when I have examples and models to work from, and I expect that I will be referring to those in the future.

The list of hackathon OSF pages is worth browsing. Hackathons are collaborative sessions for people interested in working on a defined project. Organizers varied in how much they used their OSF pages – some used them mainly for internal organization, while others hosted finished or near-finished products on them. A standout example of the latter category is from the graduate research methods course hackathon. Their OSF wiki has a list of 31 topics, almost all of which are live links to pages with learning goals, reading lists, demonstrations, and assignments. If you teach grad research methods, or anything else with methodsy content, go raid the site for all sorts of useful materials.

The program also had space for smaller or less formal events. Unconferences were spontaneously organized sessions, some of which grew into bigger projects. Lightning talks were short presentations, often about work in progress.

As you browse through the resources, it is also worth keeping in the back of your mind that many projects get started at SIPS but not finished there, so look for more projects to come to fruition in the weeks and months ahead.

A challenge for future SIPS meetings is going to be figuring out how to reach beyond the people physically attending the meeting and get the broadest possible engagement, as well as how to support dissemination of projects and initiatives that people create at SIPS. We have already gotten some valuable feedback about how other hackathons and unconferences manage that. This year’s meeting happened because of a Herculean effort by a very small group of volunteers[2] operating on a thin budget (at one point it was up in the air whether there’d even be wifi in the meeting space, if you can believe it) who had to plan an event that doubled in size from last year. As we grow we will always look for more and better ways to engage – the I in SIPS would not count for anything if the society did not apply it to itself.

My Personal Highlights

It is hard to summarize but I will mention a few highlights from things that I saw or participated in firsthand.

Neil Lewis Jr. and I co-organized a hackathon on diversity and inclusion in open science. We had so many people show up that we eventually split into five smaller groups working on different projects. My group worked on helping SIPS-the-organization start to collect member data so it can track how it is doing with respect to its diversity and inclusion goals. I posted a summary on the OSF page and would love to get feedback. (Neil is working on a guest post, so look for more here about that hackathon in the near future.)

Another session I participated in was the “diversity re-hack” on day two. The idea was that diversity and inclusion are relevant to everything, not just what comes up at a hackathon with “diversity and inclusion” in the title. So people who had worked on all the other hackathons on day one could come and workshop their in-progress projects to make them serve those goals even better. It was another well-attended session and we had representatives from nearly every hackathon group come to participate.

Katie Corker was the recipient of the society’s first award, the SIPS Leadership Award. Katie has been instrumental in the creation of the society and in organizing the conference, and beyond SIPS she has also been a leader in open science in the academic community. Katie is a dynamo and deserves every bit of recognition she gets.

It was also exciting to see projects that originated at the 2016 SIPS meeting continuing to grow. During the meeting, APA announced that it will designate PsyArXiv as its preferred preprint server. And the creators of StudySwap, which also came out of SIPS 2016, just announced an upcoming Nexus (a fancy term for what we called “special issue” in the print days) with the journal Collabra: Psychology on crowdsourced research.

Speaking of which, Collabra: Psychology is now the official society journal of SIPS. It is fitting that SIPS partnered with an open-access journal, given the society’s mission. SIPS will oversee editorial responsibilities and the scientific mission of the journal, while the University of California Press will operate as the publisher.

But probably the most gratifying thing for me about SIPS was meeting early-career researchers who are excited about making psychological science more open and transparent, more rigorous and self-correcting, and more accessible and inclusive of everyone who wants to do science or could benefit from science. The challenges can sometimes feel huge, and I found it inspiring and energizing to spend time with people just starting out in the field who are dedicated to facing them.

*****

1. Or maybe it was the first meeting, since we ended last year’s meeting with a vote on whether to become a society, even though we were already calling ourselves that? I don’t know, bootstrapping is weird.

2. Not including me. I am on the SIPS Executive Committee so I got to see up close the absurd amount of work that went into making the conference. Credit for the actual heavy lifting goes to Katie Corker and Jack Arnal, the conference planning committee who made everything happen with the meeting space, hotel, meals, and all the other logistics; and the program committee of Brian Nosek, Michèle Nuijten, John Sakaluk, and Alexa Tullett, who were responsible for putting together the scientific (and, uh, I guess meta-scientific?) content of the conference.