Solving the Reproducibility Problem in Biomedical Research

27 Feb, 2023 | Blog posts, Interviews

The ongoing debate in the scientific community about whether studies in the life sciences are perfectly reproducible is often referred to as the “reproducibility crisis in biomedical research.”

We’d like to share our view on this issue and present an example of how this problem is addressed in the field of neuroscience. We reached out to Charlotte Ohonin, neuroscientist, Founder and CEO of NORGANOID GmbH, and asked her for insights into how she and her team approach the issue of reproducibility.

Why is reproducibility in biomedical research important?

Data on reproducibility in scientific literature is rare. Recent evidence from meta-studies suggests that only between 10% and 40% of the research conducted can be rated as reproducible. A 2016 survey on reproducibility among 1,576 researchers also yielded surprising results: more than 70% of scientists had tried and failed to reproduce another scientist’s experiment, and more than 50% had failed to reproduce their own experiments (Baker, 2016). This underlines the importance of finding a consensus on what reproducibility means, in order to limit false findings and questionable or flawed studies in scientific research.

Scientists believe the root of this issue to be the pressure to publish and selective reporting. To put it simply, the reproducibility crisis narrative holds that a large proportion of published research is unreliable due to poor processes, poor data management, and a lack of transparency. The declining quality and integrity of research and publication practices often cause reproducibility problems. Ultimately, this can result in studies with low statistical power, which in turn might increase the risk of both false-positive and false-negative results (Baker, 2016).

Currently, several tools (e.g., SciScore) can identify certain indicators of reproducibility, such as transparency in research. However, they cannot be used to map and monitor these indicators across all published literature and studies in biomedical research (Serghiou et al., 2021). These tools use code that is not openly available, they may be paid services, and little is known about their “true” performance. SciScore uses a method known as conditional random fields to identify measures such as randomization, blinding, and power analysis across PubMed databases, which contain open-access literature, and derives a score of rigor and transparency from them. According to Serghiou et al. (2021), however, this tool lacks indicators of transparency such as data or code sharing, and it does not provide article-specific data or the underlying code. Therefore, a tool is still needed to assess the different indicators of transparency across the entire open biomedical literature (Serghiou et al., 2021).

As a result, recent concerns about reproducibility have led scientists to call for more open and transparent research practices. With thousands of new articles published each week, it is quite unrealistic to manually monitor transparency and keep traceability high (Serghiou et al., 2021). 

In biomedical science, as in other fields, results that cannot be reproduced are of little value. Therefore, a systematic approach and precise tracking of experimental procedures are required. This also involves taking potential sources of error into account, both systematic and statistical.

In an ideal world, analysis workflows should be described in detail, especially regarding variations in subject selection, preprocessing, and validation practices. This helps other researchers to follow the same steps and obtain similar results within the margins of experimental error. Without such a description, there is a risk of false or exaggerated findings, which comes at the cost of research quality (Plesser, 2018). However, there are ways to tackle this issue of reproducibility:

Factors to consider when tackling the Reproducibility Crisis. Adapted from Drucker D. J. (2016).

The reproducibility debate in neuroscience

Like any other biomedical field, neuroscience is affected by the reproducibility crisis. In neuroscience in particular, the reproducibility problem is largely related to the use of multimodal data consisting of neuroimaging input and behavioral observations. Researchers rely strongly on brain-wide association studies (BWAS) to measure brain function and structure. To do so, they use Magnetic Resonance Imaging (MRI) brain scans and link the results to a complex set of markers such as behavior, cognition, neurological conditions, mental illness, and personality traits (Marek et al., 2022).

A study published in the journal Nature acknowledges the promise of brain imaging for diagnostics, prognosis, and treatment response in neurological, psychological, and psychiatric conditions. However, it points out that the vast majority of published BWAS lack a sufficient number of participants to yield reproducible and reliable results. The researchers concluded that associations between brain structure and behavior estimated from a sample of 25 participants, which is the median sample size in published studies, are not representative and have low statistical power. Results could only be reproduced once the sample size was increased to the thousands (Marek et al., 2022).
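To make the link between sample size and statistical power concrete, here is a minimal sketch. It uses the standard Fisher z approximation for a correlation test. The function name and the effect size of r = 0.1 are illustrative assumptions, not values taken from Marek et al. (2022).

```python
from math import atanh, sqrt
from statistics import NormalDist


def correlation_power(r, n, alpha=0.05):
    """Approximate two-sided power to detect a true correlation r
    with n participants, using the Fisher z-transformation."""
    if n <= 3:
        raise ValueError("the Fisher z approximation needs n > 3")
    standard_normal = NormalDist()
    z_crit = standard_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the assumed true correlation.
    shift = atanh(r) * sqrt(n - 3)
    # Probability of landing in either rejection region.
    return standard_normal.cdf(shift - z_crit) + standard_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    assumed_r = 0.1  # hypothetical small brain-behavior association
    for n in (25, 250, 2000):
        print(n, round(correlation_power(assumed_r, n), 3))
```

Under this assumed effect size, power stays well below conventional targets at 25 participants and only approaches certainty once the sample reaches the thousands, which is the pattern the study describes.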

MRI has transformed our understanding of the human body and brain through task functional MRI or lesion studies. Larger sample sizes for BWAS now offer new possibilities for creating more statistically robust models while avoiding human bias.

Researchers’ degrees of freedom make it very easy to publish studies that are hard to reproduce, even though the published results initially appear to adhere to methodological standards.

How more effective reproducibility standards in biomedical science can be developed is still a debatable subject in the scientific community. How do you deal with this issue in your research practice? Please comment below to get the discussion going.

Reproducibility vs. Replicability

Amplified by the growing concerns over failed attempts to replicate existing positive research findings, the ongoing debate over a reproducibility crisis has been simmering for many years now. Recently, replication studies have been on the rise. Data from meta-analysis studies has especially pointed to the lack of reproducible results in biomedical research (Hunter, 2017).  

Due to the dynamic nature of scientific research, it is essential to critically assess the accuracy of scientific claims or findings. To do so, different scientific disciplines use terms such as “Repeatability”, “Reproducibility” and “Replicability”, which are sometimes used interchangeably but mean different things (Plesser, 2018). In previous years, several authors have attempted to solve this terminology problem.

Jon Claerbout, a Stanford geophysicist, even though focusing on computational science, was one of the first to address the problem of reproducibility by defining these terms (Claerbout & Karrenbach, 1992). Soon other researchers followed. Based on the different coined terms, the Association for Computing Machinery (2016) adopted the following definitions:

Did you know?

Repeatability means that a measurement can be obtained with stated precision by the same team using the same measurement procedure and measuring system, under the same operating conditions, and in the same location on multiple trials (Association for Computing Machinery, 2016).

Replicability means that a different group of researchers can yield the same results with stated precision using the same measurement procedure, and measuring system, under the same operating conditions, in the same OR a different location on multiple trials (Association for Computing Machinery, 2016). 

Reproducibility, on the other hand, means an independent group can obtain the same results using a different measuring system, in a different location on multiple trials (Association for Computing Machinery, 2016).

Whitaker’s matrix of reproducibility. The image is used under a Creative Commons Attribution license (CC BY 4.0).

It is thus necessary to be cautious about the interchangeable use of these terms and what they might refer to. 

Goodman et al. (2016) offer a review of an array of reproducibility definitions and related terminology and suggest important points to avoid misunderstandings. They imply that there is a substantial benefit when refocusing on cumulative evidence and truth. Considering that reproducibility is an important indicator of the trustworthiness of research they introduce new terminology to distinguish between the various existing interpretations of research reproducibility (Goodman et al., 2016). 

Plesser (2018) also emphasizes that there are different types of reproducibility, depending on the conditions that are being reproduced. In general, this yields three terms, which are mainly applied to the biomedical field but can be used across many scientific domains (Goodman et al., 2016):

  • Methods reproducibility
  • Results reproducibility
  • Inferential reproducibility

Methods Reproducibility:

If you want to reproduce your methodology and be able to repeat the same exact procedures, you need to make sure to provide enough detail about study procedures and data. 

In biomedical research this refers to the disclosure of the following information (at minimum): 

  • Study protocol
  • Details on measurement procedures
  • Data gathered
  • Data used in the analysis (including descriptive metadata)
  • Analysis software and code used
  • Final results of the analysis

In theory, this is clear. In practice, however, it is often hard to provide the level of detail needed to achieve methods reproducibility, and this level of detail is usually not included in publications (Goodman et al., 2016).

In clinical science, it is vital to reach a general agreement about the level of detail used and needed in research projects and publications. This should include a description of measurements, degree of processing of the raw data, and complete reporting on analytics.
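As a rough illustration of what recording this level of detail could look like in an analysis script, the sketch below builds a small provenance record. It stores checksums of the data files used, the software environment, and the analysis parameters. All names (`build_manifest`, `protocol_id`, the field names) are hypothetical and not part of any standard; a real project would adapt them to its own reporting conventions.

```python
import hashlib
import json
import platform
import sys
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_files, parameters, protocol_id):
    """Collect the details another team would need to repeat the analysis.

    The field names are illustrative assumptions, not a published schema.
    """
    return {
        "protocol_id": protocol_id,
        "python_version": sys.version.split()[0],
        "platform": platform.platform(),
        "parameters": dict(parameters),
        # Checksums let others confirm they analyze exactly the same data.
        "data_files": {Path(p).name: sha256_of(p) for p in data_files},
    }


def save_manifest(manifest, destination):
    """Write the manifest next to the results as human-readable JSON."""
    with open(destination, "w", encoding="utf-8") as handle:
        json.dump(manifest, handle, indent=2, sort_keys=True)
```

Saving such a record alongside the final results is one lightweight way to keep the data, code environment, and parameters of an analysis traceable, even when the publication itself has no room for them.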

Did you know?

Methods Reproducibility: obtaining enough detail about study procedures and data to repeat the exact same workflows.

Results Reproducibility:

Results reproducibility has previously been referred to as replicability. It addresses instances in which scientists collect new data to obtain the same results through an independent study closely matching the procedures used in the original study. However, difficulties often arise when trying to define the criteria by which results can be considered similar or comparable.

Did you know?

Results Reproducibility: obtaining similar results through an independent study closely matching the procedures involved in the original one.

Inferential Reproducibility:

This is probably the most underestimated form of reproducibility, yet it is just as important. Inferential reproducibility means drawing qualitatively similar conclusions based on an independent study or a re-analysis of original research. It is important to note that this dimension is not identical to the other two. Researchers might come to the same conclusions using a different data sample, or they may draw different conclusions from the same original data. Many factors contribute to such discrepancies, e.g., assessments of the prior probability of the hypotheses and differences in how data is analyzed and reported (McIntosh et al., 2017).

When discussing research reproducibility and publication quality, scientists are really concerned about the truth and trustworthiness of their claims. Reproducibility and its three dimensions are used to operationalize truth. This means, for example, that if an analysis can be repeated and yields similar results, it is likely to be true and trustworthy. In addition, the body of evidence supporting a hypothesis grows when experiments are repeated (McIntosh et al., 2017).

In other words, a cornerstone of science is the possibility to critically evaluate the correctness of a scientific outcome or claim and the conclusions other scientists have drawn from it. It is vital to have clear conventions on how to deal with the role of chance, which levels of (un)certainty are acceptable, and which criteria to adopt as reproducibility benchmarks (McIntosh et al., 2017).

Did you know?

Inferential Reproducibility: drawing qualitatively similar conclusions from either an independent study or a re-analysis of the original research.

Reproducibility indicators

Advancements in biomedical research and life sciences rely heavily on solid underlying data with a high degree of credibility. However, scientific outcomes are not always reproducible to the degree needed to provide high-quality scientific evidence.

A variety of factors/indicators come into play, which may affect reproducibility and might be the reason why a certain experiment design cannot be reproduced. The experimental design and settings, sample collection and preparation, experimental conditions, and other parameters influence the degree of reproducibility.

Researchers should be able to obtain the same results and come to similar conclusions, provided that the same methods are applied to a given experiment. This in turn helps validate and amplify the original research findings (cumulative evidence). However, all too often scientific findings cannot be exactly reproduced even though standardized parameters have been used.

The lack of reproducible results in biomedical research has vast negative effects on public health, scientific output, and progress. Consequently, this causes a decrease in the public’s trust in life sciences (Freedman et al., 2015).

An analysis published in 2015 dealt with the cost of non-reproducible research and estimated that about 28 billion US dollars per year are spent in the United States alone on preclinical research that is not reproducible, does not yield accurate results, and potentially leads life science in the wrong direction (Freedman et al., 2015; Anderson et al., 2021). We would now like to discuss some of the factors contributing to the lack of reproducibility (National Academies of Sciences, Engineering, and Medicine, 2019):

  • Data management standards, e.g., results protocol, metadata, micro-meta, parameter database
  • Access to bioimage resources (publicly available)
  • Data collection, analysis, and reporting system
  • Standardized analysis methods and statistical procedures
  • Publication standards, including a description of methods
  • Cognitive bias
  • Publication of negative results

How NORGANOID GmbH stays on the cutting-edge of reproducible life science research: An interview with Founder and CEO Charlotte Ohonin

NORGANOID GmbH, a provider of automated organoid culture and analysis solutions, is a leading preclinical research organization. It delivers more reliable, clinically-oriented tools for disease modelling and testing of potential therapeutics, even at early stages of drug discovery. They are on a mission to challenge the status quo of the drug development flywheel to predict the outcome of drug screenings more accurately. Their unique organ-on-chip technology allows researchers to find potential cures and drugs to treat neurological disorders such as Alzheimer’s Disease or Epilepsy.

Charlotte Ohonin, Founder and CEO of NORGANOID GmbH. Photo kindly provided by Charlotte Ohonin.

NORGANOID GmbH Founder and CEO Charlotte Ohonin has a long list of international publications covering topics like:

To perform replicable and traceable drug development studies, researchers need to be equipped with the right technical infrastructure and state-of-the-art software tools. Here’s a glimpse of how NORGANOID GmbH tackles the problem.

Charlotte Ohonin gives insights into reproducibility best practices at NORGANOID GmbH

Elisa: Hello Charlotte, it is a pleasure to have you here today. Thank you for taking the time to give us this interview. Let’s kick things off with the first question: Could you please briefly introduce yourself and explain what your role at NORGANOID GmbH is?

Charlotte: Hello and thank you for having me. I am happy to share my thoughts and experiences with you.

First, I am the founder of NORGANOID GmbH. I started NORGANOID back in 2017 with the aim of improving 3D tissue cultures and making them accessible for different research questions in drug development and in academia.

I am also the CEO of NORGANOID GmbH. As the CEO I manage our finances and coordinate internal and external R&D projects closely with our CTO Saren Taşçıyan. Further, I am in charge of HR and involved in our Marketing & PR activities.

Charlotte’s take on the reproducibility crisis in life science

Elisa: That is amazing. So let’s dive into the topic of reproducibility now.

There is an ongoing debate in the scientific community about whether studies in the life sciences are perfectly reproducible. This is often referred to as the “reproducibility crisis in biomedical research.” So, researchers are more often concerned about whether results are reproducible or not and which sources to trust or not.

What is your view on this issue?

Charlotte: Good question. The current measure of quality research is citations and this is a problem. The quality of research results should only be measured by their reproducibility. This means that another person or group should be able to take the methods section of the paper and reproduce the results without the author’s help, optimally using devices by different manufacturers, but arriving at the same results. John Ioannidis and the Reproducibility Project: Cancer Biology are great references on this issue.

Elisa: Why do you think there is a reproducibility crisis and why does it affect so many disciplines?

Charlotte: The problems start with the incentive structure. Researchers are incentivized by funding and research institutions to publish a lot of papers, preferably in so-called “high-impact” journals. This leads to sloppy science that does not move the needle. Just recently there was a paper in Nature that discussed the decreasing disruptiveness of papers and patents over time, one reason being the unfavorable incentive structure. Furthermore, the fact that research is not getting more transparent isn’t helping reproducibility either. Data is not shared publicly, clinical trial studies are not pre-registered, robust statistical methods are mis- or under-used, and so on. Especially in biomedicine, predominantly papers with positive results are published, which leads to a self-reinforcing loop in which funding and resources follow positive results instead of going towards weeding out which results are true and which are just wrong.

Elisa: When was the first time in your career that you got “confronted” or faced with the issue of reproducibility and how did you deal with it then?

Charlotte: From 2015 to 2016 I set up a retinal 3D culture from human induced pluripotent stem cells (iPSC) at Leiden University Medical Center in the Netherlands. Working with iPSC can be pretty challenging sometimes. Cell lines behave differently, even though you try to treat them equally. Although I applied the same protocols to the different iPSC lines I achieved different yields of retinas per cell line. So these results were not reproducible.

To deal with the situation, I conducted the tissue generation procedures for all cell lines at the same time with the media and reagents prepared on the same days to avoid variations. Unfortunately, the results didn’t change significantly.

However, there is a general consensus that setting up experiments under pre-defined conditions is essential to achieve greater reproducibility. Besides technical conditions, some variables cannot be controlled easily, such as biological behavior (in the case of stem cells) and human handling, which means reproducibility remains a challenge.

Elisa: How has your commitment to reproducibility as a researcher and scientist changed or evolved over the years (also considering when your academic experience deepened)?

Charlotte: Reproducibility is an ever-present and important topic my team and I are dealing with daily. To succeed with our goals of providing a platform that independently produces tissues from stem cells, it is vital to guarantee standard procedures to acquire identical results at any time when performing a particular task. So, we put a great emphasis on providing reproducible data.

On handling the issue of reproducibility

Elisa: Please take us through how you tackle the issue of reproducibility in your current research.

Charlotte: We try to tackle this issue by standardizing the process of 3D cell cultures with automation. That includes the integration of the IKOSA Platform into our system, too. This way we take the human factor out of the process. Every step of the process can be programmed as a protocol, which can be repeated on different devices in another lab. This alone will definitely not solve the reproducibility crisis, but it is a small step forward.

Elisa: How do you know if you can trust particular results of a study when doing your own literature review? What criteria do you look at or which factors signify reproducibility to you?

Charlotte: We would look at which methods the research builds upon. If it uses methods that have been used by other groups and uses them to reproduce the original results, that is a plus for the paper. If, after using the method to produce their own results, the authors then use an orthogonal method to see if their results hold true, this is even better. If the authors also make their data, materials, and software directly available, this makes the paper even more trustworthy to us.

On the statistical side of things, we would look at the effect size and p-value of their results and calculate their statistical power a posteriori. If the p-value is >0.01 and the power is <90%, we assume that their sample size was too small to give a significant result. This would make us skeptical of their results.
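The screening rule described in this answer can be sketched in a few lines. The example below is only an illustration under stated assumptions: it considers a two-group comparison with equal group sizes, uses a normal approximation, and evaluates power at a significance level of 0.01 to match the p-value threshold mentioned. The function names and the choice of Cohen’s d as the effect size are hypothetical and not NORGANOID’s actual procedure.

```python
from math import sqrt
from statistics import NormalDist


def post_hoc_power(effect_size_d, n_per_group, alpha=0.01):
    """Approximate two-sided power of a two-group comparison, given the
    reported standardized effect size (Cohen's d) and the group size."""
    standard_normal = NormalDist()
    z_crit = standard_normal.inv_cdf(1 - alpha / 2)
    # Expected test statistic under the reported effect (normal approximation).
    shift = abs(effect_size_d) * sqrt(n_per_group / 2)
    return standard_normal.cdf(shift - z_crit) + standard_normal.cdf(-shift - z_crit)


def looks_underpowered(p_value, power, p_threshold=0.01, power_threshold=0.90):
    """Apply the rule of thumb: p-value above 0.01 and power below 90%."""
    return p_value > p_threshold and power < power_threshold


if __name__ == "__main__":
    # Hypothetical reported values: d = 0.5, 10 samples per group, p = 0.04.
    power = post_hoc_power(0.5, 10)
    print(round(power, 3), looks_underpowered(0.04, power))
```

A result flagged by this check is not necessarily wrong. It simply suggests that the sample may have been too small to support the claim, which is the cue for skepticism described above.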

On best practices for reproducibility in life sciences

Elisa: If you could publish the “ideal” paper, what would it need in order to be a fully reproducible paper or research project?

Charlotte: We would follow the recommendations by John Ioannidis and colleagues:

  • Protecting against cognitive biases, e.g., through blinding and randomization
  • Independent methodological support, e.g., involvement of methodologists and independent oversight
  • Collaboration with other researchers and teams across different disciplines
  • Preregistering the research project, ideally as a registered report
  • Open data, materials, software, and so on to increase the project’s transparency
  • Publishing the results as a preprint in order to get public review, not only journal-based peer review

Elisa: What advice would you give to a young researcher in biomedical sciences just about to start his or her career concerning reproducibility?

Charlotte: If you are just starting out and deciding on a Master’s thesis or Ph.D. thesis topic, look for recent papers that you enjoyed reading and try to replicate and reproduce the paper’s methods and results. You will learn new methods. If the replication fails, you will know how not to do science. And if you publish your results, you will positively contribute to scientific practice.

Elisa: Great advice. Which research paper of your own work would be a best practice example in terms of reproducibility?

Charlotte: We haven’t published our work on automated tissue engineering yet, but we are planning to do so once we have enough data to present. Our previous publications mainly focus on modeling disease with stem cells rather than on reproducibility. So soon there will be a great example from our side out there.

On Women in Science and balancing two roles: career woman and motherhood

Elisa: Women scientists are leading ground-breaking research across the globe. However, according to the UNESCO Science Report, women still represent only 33.3% of researchers globally despite their remarkable discoveries.

As a woman yourself and a young mother, what advice would you give other women interested in biomedical sciences and AI, also regarding facing obstacles such as reproducibility?

Charlotte: I strongly believe that if there is a passion, there is a way of bringing that fascination into reality. However, it takes a lot of commitment and openness to work toward that particular interest. In my opinion, everyone should dare and at least learn from those experiences regardless of gender. You have to practice an idea, a concept, or a method to tackle reproducibility issues and you should collaborate and exchange with other researchers to achieve your goal.

Elisa: Women in science frequently question themselves about how they could successfully reconcile family life or motherhood with a scientific career. Motherhood is also often referred to as the “Elephant in the Lab” and yet many female scientists are showing that a successful career and being a mother do work.

What is your experience as a scientist and mother? What advice would you give to other female scientists and mothers-to-be, especially in the field of biomedical science?

Charlotte: There is always a challenge in life that forces you to adapt and transform. Life truly is a journey. Creating a family is one of those challenges but also a wonderful thing, which is often neglected in such discussions. Sure, it is anything but easy. However, a lot of other women have already successfully shown that having a family doesn’t necessarily limit your chances of having a fulfilling scientific career. Sometimes the opposite is true, as it can be a career enabler. Women managing their careers and their families at the same time are very much aware of what they are doing. Every individual knows best what is worth working for. While it may be challenging at times, balancing the roles of being a mother and a career woman is not impossible.

Elisa: Thank you! It was a pleasure to learn your point of view on these challenging subjects.

Our authors:

KML Vision Team Elisa Opriessnig Content writer

Elisa Opriessnig

Content writer focused on the technological advancements in healthcare such as digital health literacy and telemedicine.

KML Vision Team Fanny Dobrenova Marketing Specialist

Fanny Dobrenova

Health communications and marketing expert dedicated to delivering the latest topics in life science technology to healthcare professionals.


Anderson, J. M., Wright, B., Rauh, S., Tritz, D., Horn, J., Parker, I., Bergeron, D., Cook, S., & Vassar, M. (2021). Evaluation of indicators supporting reproducibility and transparency within cardiology literature. Heart (British Cardiac Society), 107(2), 120–126.

Association for Computing Machinery (2016). Artifact Review and Badging. Available online at: 

Baker M. (2016). 1,500 scientists lift the lid on reproducibility. Nature, 533(7604), 452–454.

Begley, C. G., & Ioannidis, J. P. (2015). Reproducibility in science: improving the standard for basic and preclinical research. Circulation research, 116(1), 116–126.

Bryan, C. J., Yeager, D. S., & O’Brien, J. M. (2019). Replicator degrees of freedom allow publication of misleading failures to replicate. Proceedings of the National Academy of Sciences of the United States of America, 116(51), 25535–25545.

Claerbout, J. F., & Karrenbach, M. (1992). Electronic documents give reproducible research a new meaning. SEG Expanded Abstracts, 11, 601–604. doi: 10.1190/1.1822162

Drucker D. J. (2016). Never Waste a Good Crisis: Confronting Reproducibility in Translational Research. Cell metabolism, 24(3), 348–360.

Freedman, L. P., Cockburn, I. M., & Simcoe, T. S. (2015). The Economics of Reproducibility in Preclinical Research. PLoS biology, 13(6), e1002165.

Goodman, S. N., Fanelli, D., & Ioannidis, J. P. (2016). What does research reproducibility mean?. Science translational medicine, 8(341), 341ps12.

Hunter P. (2017). The reproducibility “crisis”: Reaction to replication crisis should not stifle innovation. EMBO reports, 18(9), 1493–1496.

Marek, S., Tervo-Clemmens, B., Calabro, F. J., Montez, D. F., Kay, B. P., Hatoum, A. S., Donohue, M. R., Foran, W., Miller, R. L., Hendrickson, T. J., Malone, S. M., Kandala, S., Feczko, E., Miranda-Dominguez, O., Graham, A. M., Earl, E. A., Perrone, A. J., Cordova, M., Doyle, O., Moore, L. A., … Dosenbach, N. U. F. (2022). Reproducible brain-wide association studies require thousands of individuals. Nature, 603(7902), 654–660.

McIntosh, L. D., Juehne, A., Vitale, C. R. H., Liu, X., Alcoser, R., Lukas, J. C., & Evanoff, B. (2017). Repeat: a framework to assess empirical reproducibility in biomedical research. BMC medical research methodology, 17(1), 143.

National Academies of Sciences, Engineering, and Medicine; Policy and Global Affairs; Committee on Science, Engineering, Medicine, and Public Policy; Board on Research Data and Information; Division on Engineering and Physical Sciences; Committee on Applied and Theoretical Statistics; Board on Mathematical Sciences and Analytics; Division on Earth and Life Studies; Nuclear and Radiation Studies Board; Division of Behavioral and Social Sciences and Education; Committee on National Statistics; Board on Behavioral, Cognitive, and Sensory Sciences; Committee on Reproducibility and Replicability in Science. Reproducibility and Replicability in Science. Washington (DC): National Academies Press (US); 2019 May 7. 3, Understanding Reproducibility and Replicability. Available from: 

Plesser H. E. (2018). Reproducibility vs. Replicability: A Brief History of a Confused Terminology. Frontiers in neuroinformatics, 11, 76.

Serghiou, S., Contopoulos-Ioannidis, D. G., Boyack, K. W., Riedel, N., Wallach, J. D., & Ioannidis, J. P. A. (2021). Assessment of transparency indicators across the biomedical literature: How open is open?. PLoS biology, 19(3), e3001107.

