Enhancing Reproducibility

  1. Lehman College, CUNY

Reproducibility is the cornerstone of scientific research. A lack of reproduced data causes knowledge gaps, skepticism towards scientific research, and an inability to test hypotheses under changing conditions, which limits their acceptance, especially beyond the immediate scientific community (Jarvis and Williams 2016). In modern scientific research, however, attempts at reproduction have often failed. For example, Begley and Ioannidis (2015) report that Bayer HealthCare Germany was unable to reproduce 65% of the studies it attempted to recreate. An even more dismal 89% failure rate was reported by Amgen and the M.D. Anderson Cancer Center (Begley and Ioannidis 2015). The reasons cited include sloppy initial research, poor statistical analysis, and often outright fabrication of data. There is also the issue of incentives, which is the focus of this proposal.

As noted by Alberts et al. (2015), reproducibility is not viewed as a priority in science from either a professional or a financial standpoint. The pressure to publish novel findings, year after year, is endemic to the scientific community. In this environment, it is simply not viable either to publish disappointing data or to spend time on reproducibility, which would take time away from new studies. A reproducibility crisis is therefore almost inevitable, and reversing it will require a reordering of incentives. Ioannidis (2014) notes that the chief drivers of research are the ability to publish, to profit, or both, and that external corporate pressures will ensure that these remain the primary drivers of science. Working within this framework, rather than overturning it, seems a realistic and fruitful endeavor.

To that end, and with the understanding that agencies such as the NIH set guidelines that many top institutions follow, I turn to and build on Collins and Tabak (2014). In their paper, Director Collins and Principal Deputy Director Tabak propose several promising avenues, such as the use of training modules and checklists. The proposal that is perhaps most intriguing, and that this paper seeks to flesh out, is the idea of a Data Discovery Index (DDI), which would house unpublished primary data. If those data are used in a published paper, the originator of the information would receive publication credit.

In this proposal, the DDI would be tweaked in two ways. First, its entries would be flagged so that if the database contained information that was useful or verified, even if the results were not published, the originator of the data would receive credit. Perhaps, instead of a publication, a new CV line, such as Verifications, could be created. This is in line with the idea expressed by Collins and Tabak (2014) in the founding NIH document, that this project “foster the wide-scale collaborations and partnerships between and among key stakeholders.” For instance, such an entry might read:

Boka Z. (2016) Using computer based therapy to treat pragmatics related deficiencies in adolescent TBI survivors. Verified by Smith G. and Nadel R. Ohio State University 24 May 2016.

This tweak would give investigators recognition for having their data verified by an external source. It would also, of course, work for the verifier, perhaps with a line such as:

Smith G. (2016) Verified via DDI “Using computer based therapy to treat pragmatics related deficiencies in adolescent TBI survivors.” Data by Boka Z., Lehman College submitted 3 April 2016.

This would allow people to receive CV recognition for work that is indispensable to science but that no mechanism currently validates. It would also legitimize and professionalize the verification process. Baker (2016) identifies a key stumbling block with regard to reproducibility: researchers are hesitant to contact each other for help with reproducibility studies because they risk sounding naïve or accusatory. By integrating reproducibility into the CV framework, the air of illegitimacy that Baker (2016) identifies as haunting researchers when they attempt to broach the subject would be lifted. At the same time, since verification would become more mainstream, researchers would have fewer reasons not to participate. Of course, this would not remove the myriad roadblocks posed by, for example, confidentiality agreements, but it would make the subject easier to formalize and legitimize.

The second proposal relates to funding. Presently, there is no mechanism that I am aware of whereby data collection and verification are independently funded. My proposal is to create a verification grant category within the NIH-DDI framework, which would hopefully serve as a model for other funding agencies and sources. In this scenario, information would be financially rewarded once it has been independently verified. The reward would be a reasonable sum, perhaps up to $5,000 or thereabouts: enough to be meaningful but not enough to risk misconduct over.

To prevent abuse, mindful of the age-old injunction that the love of money is the root of all evil, this financial reward could be earned no more than twice a year by any one investigator and could not be earned more than two years in a row. Further, the person verifying the data would not be eligible for a financial reward. This would make the process somewhat one-sided, but it would discourage collusion between researchers, whereby one would submit information, another would verify it, and both would share in the reward regardless of the quality or verifiability of the data. Since researchers can both submit and verify information on the DDI, this restriction does not preclude anyone from potentially benefiting from the system. It would simply act as a safeguard against gaming it.
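The eligibility rules above can be sketched as a simple check. This is only an illustration under stated assumptions, not part of any existing NIH or DDI system: the `Verification` record, the `is_reward_eligible` function, and the decision to refuse self-verified submissions are all hypothetical choices of mine, and "two years in a row" is read here as forbidding a third consecutive reward year.

```python
from __future__ import annotations

from dataclasses import dataclass

MAX_REWARDS_PER_YEAR = 2    # "no more than twice a year"
MAX_CONSECUTIVE_YEARS = 2   # "not ... more than two years in a row"


@dataclass(frozen=True)
class Verification:
    """One independently verified DDI submission (hypothetical record)."""
    originator: str
    verifier: str
    year: int


def is_reward_eligible(candidate: str, new: Verification,
                       past_reward_years: list[int]) -> bool:
    """Return True if `candidate` may receive the reward for `new`.

    `past_reward_years` holds one entry per reward the candidate has
    already received, e.g. [2015, 2016, 2016].
    """
    # Only the originator is rewarded; the verifier is never eligible.
    if candidate != new.originator:
        return False
    # Assumption: a submission verified by its own originator does not count.
    if new.originator == new.verifier:
        return False
    # At most two rewards in any one calendar year.
    if past_reward_years.count(new.year) >= MAX_REWARDS_PER_YEAR:
        return False
    # No reward in a year that would extend a streak past two years.
    rewarded = set(past_reward_years)
    streak = all(new.year - k in rewarded
                 for k in range(1, MAX_CONSECUTIVE_YEARS + 1))
    return not streak


example = Verification(originator="Boka Z.", verifier="Smith G.", year=2016)
print(is_reward_eligible("Boka Z.", example, [2015]))   # originator
print(is_reward_eligible("Smith G.", example, []))      # verifier
```

Because the limits are tracked per originator, the same researcher remains free to verify other people's data in any year without affecting his or her own eligibility as a submitter.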

By bringing verification into the professional mainstream and incentivizing those who seek verification, the NIH can set an example for other granting agencies to follow. Perhaps, in a few short years, the reproducibility crisis will have lessened.


Alberts, B., Cicerone, R. J., Fienberg, S. E., Kamb, A., McNutt, M., Nerem, R. M., ... & Zuber, M. T. (2015). Self-correction in science at work: Improve incentives to support research integrity. Science, 348(6242).

Baker, M. (2016). 1,500 scientists lift the lid on reproducibility. Nature, 533(7604), 452-454.

Begley, C. G., & Ioannidis, J. P. (2015). Reproducibility in science: Improving the standard for basic and preclinical research. Circulation Research, 116(1), 116-126.

Collins, F. S., & Tabak, L. A. (2014). NIH plans to enhance reproducibility. Nature, 505(7485), 612.

Ioannidis, J. P. (2014). How to make more published research true. PLoS Med, 11(10), e1001747.

Jarvis, M. F., & Williams, M. (2016). Irreproducibility in preclinical biomedical research: Perceptions, uncertainties, and knowledge gaps. Trends in Pharmacological Sciences, 37, 290-302.



This article and its reviews are distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and redistribution in any medium, provided that the original author and source are credited.