The Publication Plan – for everyone interested in medical writing, the development of medical publications, and publication planning

The evolution of evaluation: Richard Sever on the future of peer review

Peer review is fundamental to the evaluation of biomedical research, ensuring the rigour and credibility of published scientific findings. However, the system is under mounting pressure due to the sheer volume of research being conducted, and the quality and timeliness of research evaluation are increasingly at stake. Richard Sever, co-founder of the bioRxiv and medRxiv preprint servers, is at the forefront of efforts to innovate in this space. We spoke with Richard to discuss his vision for the future of peer review, exploring how preprints and evolving evaluation methods might address the challenges facing scientific publishing today.

You recently participated in a session on the future of peer review at the ISMPP Annual Meeting. Do you believe that the existing peer review model effectively meets the needs of the scientific community, particularly in biomedical and clinical research? If there is room for improvement, what are the main deficiencies of the current system and what can be done to address them?

“I do think there’s room for improvement. When we say peer review, often what we mean is a broader picture that includes the editorial and administrative checks that a journal does, as well as the formal review by peers. That’s where things vary a lot – there are some journals that are incredibly responsible and do a very good job, and we know that there are some where it’s peer review in name only, most obviously the predatory journals. But there’s a spectrum, so there’s a lot of opportunity to improve the process. Part of that might be making different choices for different types of article. For example, for papers where there’s patient involvement, there needs to be far more stringent scrutiny than for a basic research paper. Patient consent for publication, deidentification of patient data – you can’t really expect peer reviewers to do those kinds of checks; you expect the journal to do them. In recent years, I’ve become more concerned about these editorial checks than peer review per se, because opinions will differ on the quality of manuscripts and it’s clearly not the case that the three people who peer review a paper are a representative sample of everybody who could review it; however, the integrity checks that a journal performs may ultimately be more important. Different journals cover different subjects though, so maybe they can approach things differently. A journal dealing with a high volume of basic research papers, for example, may not need to worry as much about certain checks. This is where we start considering the benefits of peer review, and in some cases, it may be better done after publication, leading to a more multidimensional, ongoing process. On the other hand, for a vaccine study, you may want a very thorough peer review before it goes out into the world, depending on the results.”

“…there’s a lot of opportunity to improve peer review. Part of that might be making different choices for different types of article.”

You co-founded the medRxiv preprint server for health science research in 2019. How and where do preprint servers fit into the existing peer review model? Has that positioning evolved in the years since medRxiv was launched?

“The clear thing about preprint servers is that they’re decoupling research dissemination from research evaluation and specifically from peer review evaluation. What has become very clear both in the basic science space and in the clinical space is that you can do this so long as you responsibly put out preprints and make it clear that these are authors’ claims and they have not been verified. This is a good thing, because it acclimatises people to the fact that science can be a bit messy and just because somebody has put something out there, it doesn’t mean it’s necessarily valid. Preprints have demonstrated that you can do this decoupling, which then allows us to have a conversation about what the evaluation should look like. There are checks you can do very quickly at a preprint server: Does this paper look like it’s completely plagiarised? Does it seem completely unreasonable? Once those checks are done and the article is online, there’s more time to do a thorough review with less pressure. This is where the real opportunity lies for journals, and indeed new organisations that want to do peer review differently, to say, ‘OK, the paper is out there, we are now going to evaluate it. Can we evaluate it in a better way because we haven’t had to rush the evaluation, as the dissemination has already been achieved?’”

“Preprints have demonstrated that you can decouple research evaluation from research dissemination.”

“In the 10 years since bioRxiv launched, we’ve had many different fields embracing this process and people understanding that you have to read the paper yourself; you can’t just take its conclusions on trust. It’s concentrated people’s minds in that respect, because we can all point to papers that apparently underwent ‘peer review’ but we’re aghast that they somehow made it through. What’s interesting is that the existence of bioRxiv is allowing people to begin to experiment with peer review. You now have organisations like Review Commons and Peer Community In, which are not journals; they are peer review services that operate based on the fact that there is already a preprint out there on bioRxiv or medRxiv.

“The other thing we’ve certainly found at medRxiv is that you have to do this responsibly. There’s a small number of papers where the findings might influence public behaviour and we say these should go through peer review before dissemination, but that’s not true of 99% of clinical papers. That’s part of medRxiv’s initial screening, the obvious example being a paper claiming a life-saving treatment or vaccine was dangerous – a consequence of its dissemination could be that a lot of people stop taking the treatment. That would be a problem and we wouldn’t post it. But most papers aren’t in that category, and in the clinical space, the pandemic showed that epidemiology could be disseminated as preprints with huge benefit. For example, the RECOVERY trial showing dexamethasone was an effective treatment for severe COVID came out as a preprint on medRxiv many weeks before it appeared in the New England Journal of Medicine.”

Thinking specifically about pharmaceutical industry-sponsored biomedical research, how have pharmaceutical companies embraced the use of preprint servers for disseminating their research findings? Speed of dissemination of preprints was a notable benefit during the COVID-19 pandemic. What are the other motivations for industry to use preprint servers for research dissemination?

“To the credit of the pharmaceutical industry, some of them are trying to figure out whether this is something they can or should do. We did get industry-supported papers showing the effectiveness of the COVID vaccines against different variants and that type of thing during the pandemic. So industry can and should make use of preprint servers. Part of the hesitation is this question of ‘safe harbour’ and what seems not quite resolved in everybody’s minds is whether pharmaceutical companies can put out these sorts of studies under safe harbour. The preclinical studies, the very basic research, I think they’re happy with, but some people in the pharmaceutical industry are worried that if they put out a paper that seems to show a clinical effect as a preprint, then they might be accused of trying to use the preprint server as a way to get around peer review and get out publicity claiming that a treatment works.

“Speed of dissemination is the number one motivation for using a preprint server; another motivation is that you can revise preprints. So you can put out a preprint, get some comments, and improve it so that when you do send it to a journal, it’s in much better shape. A lot of people have observed that their papers have had easier rides through peer review at journals because they’d ironed out some of the kinks after getting feedback on the preprint. There may also be some papers where you’re just getting some information out there – a follow-up work, for example – that don’t need formal peer review, and this will instead come in the community discussion that happens afterwards. I think that’s a debate among the scientific and clinical community as to what percentage of papers fall into that category.”

What are the primary challenges associated with the submission of industry-sponsored research to preprint servers? There can often be considerations relating to proprietary data, regulatory requirements, and the potential for misinformation when disseminating clinical studies, for instance. How can these challenges be addressed?

“This is why I think it’s important that preprint servers have screening to eliminate or minimise the possibility of misinformation. There is a difference between a responsibly operated server like medRxiv and some databases that don’t screen at all. It’s also why we have more stringency in our screening checks on medRxiv than bioRxiv, because of these kinds of concerns.

“One of the benefits of the preprint server is that it doesn’t claim to have verified the information. I’m far more concerned about misinformation that appears in journals where there is a claim that the information has been peer reviewed, so a journalist then comes across it and assumes that because it’s been peer reviewed it must be right. I often joke that the papers that claimed that COVID came from 5G towers were in so-called peer-reviewed journals, not preprints. If that sort of thing came into medRxiv, we wouldn’t post it.”

“I’m more concerned about misinformation that appears in journals where there is a claim that the information has been peer reviewed.”

Preprint review is gaining traction as an approach to evaluating scientific research before formal journal publication, and you’ve mentioned the advantages of decoupling research evaluation from dissemination. How best do you think preprint review can complement traditional journal peer review?

“One obvious way is that a journal that’s doing traditional peer review can factor in the other evaluations that are going on. Review Commons is an interesting example in that you post a paper on bioRxiv, then you can go to Review Commons, who will do the peer review, and then you can take those peer reviews to a journal. There’s also the approach that one of the PLoS journals took, where they were actively looking at the comments sections of preprints and taking the discussion into account in their peer review evaluation. I would certainly do that if I were an editor – if you’re getting two or three people’s peer reviews of a paper but there’s lots of discussion about that paper online that seems well-informed, then of course you’d want to factor that into your judgement. In the early days of Twitter, there were a lot of very good discussions of scientific papers – it’s become more polluted in recent years – and that demonstrated the potential for self-organised research evaluation. We shouldn’t lose sight of the fact that that’s what we really mean by peer review. Sometimes we think of peer review as a very formal process operated by a journal over a period of weeks, but really, in the scientific sense, peer review is the scientific community discussing and evaluating work and debating its significance. So it all comes back to this idea of decoupling research evaluation from dissemination and asking how we can do the evaluation better.”

“We shouldn’t lose sight of the fact that that’s what we really mean by peer review… …the scientific community discussing and evaluating work and debating its significance.”

Thinking about a decoupled approach to research evaluation, what do you think about a model whereby the medical societies commission their own peer reviews instead of the traditional journal peer review approach?

“One of the questions I would ask, if you were a scientific or medical society considering creating a new journal tomorrow and you knew that all the papers were going to be on bioRxiv or medRxiv, is: what’s the point in hosting the papers on a website if they’re already on a preprint server? You can just do the review part. This gets back to a phrase that some people have used to describe the future: Publish, Review, Curate. Scientific societies would be perfectly positioned to do that – they have the expertise, and they are seen as working in the interests of the scientific community. The challenge, as with so much of publishing, is the business model and who pays, but that’s a challenge the entire industry is facing. At least the decoupling means that you don’t have to pay for hosting and putting the papers online, because that’s already been done.”

We recently featured a piece on eLife’s ‘reviewed preprint’ model and the journal’s experience from the first year, with faster research dissemination without a reduction in quality. Do you see eLife’s model as a blueprint for the future of biomedical publishing?

“The interesting thing about the new eLife model is that it confronts this issue of peer review being a seal of approval. The worry has always been that you send your paper to, say, the New England Journal of Medicine, they don’t think it’s good enough to publish, and so you just go down the chain until ultimately your paper gets published somewhere – it gets a ‘tick’ saying it’s peer reviewed. Does that mean it’s correct or good enough to publish? Clearly the journals higher up the chain didn’t think so. What the eLife model does is explicitly say peer review is a process, not a judgement. You go through eLife peer review, you get peer reviews, and those peer reviews might say the evidence basis is not sufficient for the claims made. In other words, what they mean by ‘peer reviewed’ is that there are peer reviews for this paper, not that they have decided to give the paper a tick or endorsement. It’s a very interesting – and polarising – idea, because it makes people consider the difference between peer review as a process and peer review as a certification. Again, this comes back to the view that peer review doesn’t need to be the same for all papers. I could see large swathes of basic science operating like this, and clearly some of the funders seem to be thinking along these lines. I find it harder to see it working for clinical research, because there I think people do feel like they want some kind of judgement as to the veracity of the work. So I’d be less likely to predict success of the eLife model in the clinical space. It probably only works if you deliver the ‘Curate’ part of the Publish, Review, Curate model – there’s too much for people to read and they want a signal as to whether they should read something.”

“What the eLife model does is explicitly say peer review is a process, not a judgement.”

It’s inevitable with innovative approaches like preprints and preprint peer review that people can have some misconceptions and scepticism. Are there any misconceptions you would like to dispel?

“The notion that preprints and preprint servers are all incredibly irresponsible and that they lead to all this misinformation coming out – that’s not true. That’s why we have screening and these ‘do no harm’ rules. When I look back at the pandemic as an example of this, I don’t see any big errors that were made by bioRxiv and medRxiv. I do see a lot of errors that were made at journals – the Surgisphere papers, for example, or papers that said COVID came from outer space. These sorts of things were not coming out on bioRxiv and medRxiv. The infamous paper by Didier Raoult on hydroxychloroquine did appear as a medRxiv preprint, but within 24 hours it appeared in a journal as well, and that was the thing that everybody was pointing to. I wouldn’t want to blame any physicians, but in the fog of war, anecdotal reports of hydroxychloroquine and the like meant there was a problem with misinformation there – I just don’t think we should point the finger at preprints for it.”

“The notion that preprints and preprint servers lead to misinformation coming out is not true.”

What other innovative approaches should we be considering to evolve the peer review process?

“I think you could have a number of different stages of review – so decoupling things even further and saying, for example, the person who looks at the statistics in a paper need not be the same person who looks at the biology. So we might get to a point where we can say somebody’s checked a dataset, somebody’s looked at the crystal structure, somebody’s looked at the stats, etc. – and peer review evolves to be more of a constellation of trust signals in which individual elements of the paper have been verified. This could be particularly important for multidisciplinary studies where it’s conceivable that no one person could read and understand the whole paper. More generally, we should acknowledge we have been far too dependent on papers as the indicator of somebody’s scientific contribution. There are people who write code, people who create databases and data resources, for example, and we should understand that the peer-reviewed paper is part of a broader constellation of academic outputs, some of which may never produce ‘papers’.

“We could also consider the idea of separating out the technical checks of a manuscript from a contextual review, and maybe those things can be carried out by different people. That way we could involve more people in the peer review process. It’s frequently noted that the peer review process is buckling and straining and there aren’t enough peer reviewers, but there are lots of younger scientists who want to peer review papers, and maybe they can do some of the technical review while the more experienced heads do some of the more contextual review.”

Can artificial intelligence (AI) help in the peer review process, or might it cause more problems?

“The short answer is both. It’s very clear that AI can help; we all use spelling and grammar checks, and particularly for non-native English speakers, the use of large language models to help improve their English seems like a no-brainer. There are lots of useful time-saving tools, but from the author’s perspective, you can’t take any of their outputs on trust. We’re happy to have ChatGPT help write your paper, but you should read what it’s written and make sure that you agree with it, because ultimately you as the author are responsible for the content. On the flip side, undoubtedly AI will be used by bad actors to try and fake stuff, and I think a lot of publishers are talking about the notion of an arms race between the papermills and the publishers as people try to identify content that is entirely automated and fake as opposed to things that have undergone language polishing or used a tool that helps you process your data.”

Reflecting on the journey of bioRxiv and medRxiv, what have been the most surprising or significant lessons learned about the role of preprints in scientific publishing?

“I don’t know if it was a surprise, but one thing that was very striking was the rapid adoption of medRxiv during the pandemic. There’s that saying, ‘If you build it, they will come’, which I’m always very dismissive of because I see so many examples where people built things they thought were great and nobody came. But one of the lessons was that scientists do adopt things when they see clear benefits for themselves and the community. They were very quick to adopt email, for example, but less quick to adopt electronic notebooks. The experience with bioRxiv was that once people figured out what it was doing, a lot of them became converts because they saw it as a huge benefit to themselves as individuals, and also to the community. We anticipated that medRxiv would have a slow adoption phase over five years or so before anybody really used it; then came the pandemic. We launched medRxiv in 2019 and we certainly hadn’t told anyone in China about it, but by spring of 2020, when the pandemic started, we were getting dozens of papers every day from China. So it was amazing to see this brand new thing that didn’t exist even a year before the pandemic suddenly have 10 million people looking at it every month.”

“It was amazing to see this brand new thing that didn’t exist even a year before the pandemic, suddenly have 10 million people looking at it every month.”

Finally, what is your vision for the future of peer review in medical publishing? It’s been just over ten years since the founding of bioRxiv. How do you see the landscape evolving over the next decade?

“What I would really hope – and we’re beginning to see signs of this – is that the funders of research see that preprints are a really easy way to address a problem that they’ve been trying to solve for 20 years: how to provide public access to research. We’ve talked about peer review and its complexity, but the challenge of public access is one that we can solve really easily by funders just saying, “Post a preprint”. That could solve the problem tomorrow. Some funders are getting close to that, like the Chan Zuckerberg Initiative and the Michael J. Fox Foundation, and the Bill & Melinda Gates Foundation is now taking this kind of approach. So that would be my number one hope: that this solves the access problem.

“Preprints are a really easy way to address the problem of providing public access to research.”

“The other thing I’d love to see a lot more of is experiments in peer review – both by journals and self-organised communities. There’s a real opportunity for everyone involved to decide how we can do peer review better. Decoupling will also hopefully get us away from the conflation of questions like ‘Should I read this paper?’, ‘Is this person good?’ and ‘Is this work of general interest?’ These are currently all conflated in assumptions based on the journal where the paper appears, but you can have great work that’s not in the top journals, and things that are really important aren’t necessarily of broad general interest. A post-preprint ecosystem is an opportunity to try to get away from the conflation.”

Richard Sever is Assistant Director of Cold Spring Harbor Laboratory Press and co-founder of bioRxiv and medRxiv. He can be contacted via LinkedIn.
