Investigating the impact of trial retractions on the healthcare evidence ecosystem (VITALITY Study I): retrospective cohort study
BMJ 2025; 389 doi: https://doi.org/10.1136/bmj-2024-082068 (Published 23 April 2025) Cite this as: BMJ 2025;389:e082068
Linked Editorial: Retracted studies in systematic reviews and clinical guidelines
Linked Opinion: Problematic trials are contaminating the evidence ecosystem

All rapid responses
Rapid responses are electronic comments to the editor. They enable our users to debate issues raised in articles published on bmj.com. A rapid response is first posted online. A proportion of responses will, after editing, be published online and in the print journal as letters, which are indexed in PubMed. Rapid responses are not indexed in PubMed and they are not journal articles.
Dear Editor,
Xu et al.’s VITALITY study rightly highlights the structural risks that retracted RCTs pose to meta-analyses and clinical guidelines. This is a crucial first step in illuminating the fragility of our current evidence ecosystem.
Yet the core issue lies not only in whether a trial is retracted, but in the very architecture by which evidence is constructed. From a phenomenological perspective, current RCT paradigms risk becoming overly formalistic and insufficiently attuned to the lived complexity of patient experience.
The REAL study offers a compelling contrast. Targeting individuals with complex mental health needs, it assessed not only the efficacy of interventions but also recovery-oriented outcomes, social functioning, and cost-effectiveness—demonstrating that richly contextualised designs can yield clinically relevant insights.
We must move beyond the idea that a “trustworthy RCT” or “high-quality meta-analysis” can be judged solely by formal criteria. Design coherence—alignment among question, context, and method—should be treated as a central metric of evidence validity.
Future GRADE and systematic review frameworks therefore need an additional dimension: not only detecting retracted trials, but evaluating the structure of meaningful inquiry itself. As Benner and Wrubel argue, healthcare is not merely technical intervention but an “attunement” to the patient’s world.¹ Without integrating such human-centred perspectives, we risk producing evidence that is technically robust yet epistemically hollow.
The call is thus not just for stricter policing of misconduct, but for more meaningful science.
Sincerely,
Kenjiro Shiraishi
Tanashi Kitaguchi Acupuncture and Moxa Clinic, Tokyo, Japan
Nozaki Building 301, 2-9-6 Tanashi-chō, Nishi-Tokyo-shi, Tokyo 188-0011
ORCID: https://orcid.org/0009-0003-2550-7385
References
1. Benner P, Wrubel J. The Primacy of Caring: Stress and Coping in Health and Illness. Menlo Park, CA: Addison-Wesley; 1989.
2. Killaspy H, et al. The Rehabilitation Effectiveness for Activities for Life (REAL) study. NIHR Journals Library; 2017.
3. Neumann I, Santesso N, Guyatt G, et al. Core GRADE 1: overview of the Core GRADE approach. BMJ 2025;389:e081903. doi:10.1136/bmj-2024-081903
Competing interests: No competing interests
Dear Editor
Xu et al. investigated the impact of retracted trials on the production and use of healthcare evidence (1). Randomised controlled trials (RCTs) are important for clinical practice, and meta-analyses of RCTs are sometimes affected by data from retracted papers. Retracted trials have an impact on the evidence ecosystem, including evidence synthesis, clinical practice guidelines, and evidence based clinical practice. Although the authors mentioned that users should pay attention to this problem, there is little that users themselves can do. I have a comment.
Quantitative data, including pooled effect sizes, can change when retracted papers are excluded, but users cannot identify retracted papers in advance. RCTs, which are useful for understanding a specific relationship or exploring good clinical predictors of prognosis, are also used as components of meta-analyses. Each component of a meta-analysis can be updated continually as the quality of papers is reappraised. Epidemiological data from RCTs indicate a trend of association, and users should understand the limitations of the healthcare evidence ecosystem. Retraction of papers cannot be prevented completely and should be considered a part of scientific publication (2). Scientific misconduct, by contrast, should be avoided (3), and the reasons for retractions should be monitored by those using the contents of scientific papers.
References
1. Xu C, Fan S, Tian Y, et al. Investigating the impact of trial retractions on the healthcare evidence ecosystem (VITALITY Study I): retrospective cohort study. BMJ 2025;389:e082068.
2. Editorial. Retractions are part of science, but misconduct isn't - lessons from a superconductivity lab. Nature 2024;628(8009):689-690.
3. Bouter L. Tackling research misconduct. BMJ 2024;386:q1595.
Competing interests: No competing interests
Dear Editor
We agree that examining the impact of retracted trials on systematic reviews very considerably underestimates the impact of problematic studies on healthcare evidence. Retracting untrustworthy, problematic academic publications can take years, and most often never occurs. The 2023 House of Commons’ Science, Innovation and Technology Committee’s Reproducibility and Research Integrity report stated that academic publishers ‘should commit to timely publication of research error corrections and retractions … this process should not take longer than two months’ (1).
For example, we investigated 172 clinical trials from one group (2), simultaneously notifying serious integrity concerns to all journals and publishers in July 2019. The concerns were similar for all trials, including evidence that random allocation of participants could not have produced the treatment groups reported, evidence that the distribution of numbers of participants withdrawing from the trials was implausible, frequently contradictory reporting of the size of participant populations, implausibly prolific research activity, unethical conduct, and very frequent discrepancies between trial registration documents and journal publications for study conduct, study location, participant age and participant number (including wholesale updating of trial registration documents after we notified these concerns to journals and publishers).
More than 5 years later, only 22 of the 157 trials covered by Web of Science have been retracted, and these have to date 289 citations in systematic reviews, clinical guidelines and consensus statements. The 135 unretracted trials have 1,989 citations in systematic reviews, clinical guidelines and consensus statements. Presumably, some of these currently unretracted trials will eventually be retracted, further compounding the issue.
Often groups with multiple retractions have a fairly narrow research field (2,3). Including their publications in systematic reviews and guidelines on these topics can therefore have a disproportionately large impact (4,5). In the extreme case, after systematically assessing trustworthiness of eligible studies, all of them were excluded leaving no evidence base to assess (6).
Faced with such examples, it is difficult not to conclude that the current system for publication of scientific literature is fatally flawed.
References
(1) https://committees.parliament.uk/publications/39343/documents/194466/def... (accessed 29/04/2025)
(2) Bolland MJ, Gamble GD, Grey A, Avenell A. Empirically generated reference proportions for baseline p values from rounded summary statistics. Anaesthesia 2020;75:1683–92.
(3) Bolland MJ, Avenell A, Gamble GD, Grey A. Systematic review and statistical analysis of the integrity of 33 randomized controlled trials. Neurology 2016;87:2391-402.
(4) Avenell A, Stewart F, Grey A, Gamble G, Bolland M. An investigation into the impact and implications of published papers from retracted research: systematic search of affected literature. BMJ Open 2019;9:e031909.
(5) Avenell A, Bolland M, Gamble GD, Grey A. A randomized trial alerting authors, with or without co-authors or editors, that research they cited in systematic reviews and guidelines has been retracted. Accountability in Research 2024;31:14–37.
(6) Wang R. ‘And Then There Were None’—the shrinkage of trials in the evidence ecosystem. Human Reproduction 2025;https://doi.org/10.1093/humrep/deaf064
Competing interests: AA, MJB and AG are part of the UKRI metascience funded INSPECT-AI project: Evaluating the development and impact of AI-assisted integrity assessment of randomised trials in evidence syntheses (UKRI1083).
Dear Editor
The VITALITY Study I (1) provides compelling evidence on how retracted trials contaminate healthcare evidence and clinical practice. To support BMJ’s Open Science commitment, we suggest openly sharing the references of included trials. This would allow a more detailed description of the retracted studies, improving understanding of the issue. Regarding solutions, automated alert systems such as the Feet of Clay tool are likely useful to warn when a systematic review cites retracted research (2). However, while these tools target retracted trials, they may miss flawed but unretracted studies, the so-called “zombie trials”. (3) Initiatives such as INSPECT-SR (4) may help to prospectively identify such studies and prevent their inclusion in meta-analyses. These tools may also overlook unreliable meta-analyses, such as when forest plots do not correspond to the original study data. (5)
However, identifying “zombie studies” is not enough if the literature fails to self-correct. Serious concerns can persist unaddressed for years. To illustrate, despite clear inconsistencies and evidence of ‘cloned patients’, it took nearly two years to retract two Artemisia studies, while whistleblowers faced legal threats instead of support. (6) A trial on tacrolimus in children with atopic dermatitis showed numerous inconsistencies, including potential manual addition of error bars in bar plots (7), yet no correction had been issued one year later. Another trial, on emotional disorders in women with multiple sclerosis (8), which contains a fill-in-the-blank text among other inconsistencies reported on 18 November 2022, remains uncorrected. The rapid retraction of an esketamine trial was a rare positive example. (9) When the US Food and Drug Administration finds major departures from good clinical practice, these issues are rarely disclosed or later corrected in trial publications. (10) Similarly, the eventual correction of the systematic reviews and guidelines polluted by retracted trials, as highlighted in the VITALITY study, remains to be seen. Despite growing efforts to clean the literature, the issue may remain largely unsolved.
REFERENCES
1. Xu C, Fan S, Tian Y, et al. Investigating the impact of trial retractions on the healthcare evidence ecosystem (VITALITY Study I): retrospective cohort study. BMJ 2025;389:e082068. doi: 10.1136/bmj-2024-082068 [published Online First: 20250423]
2. Graña Possamai C, Cabanac G, Perrodeau E, et al. Inclusion of Retracted Studies in Systematic Reviews and Meta-Analyses of Interventions: A Systematic Review and Meta-Analysis. JAMA Intern Med 2025 doi: 10.1001/jamainternmed.2025.0256 [published Online First: 20250331]
3. Carlisle JB. False individual patient data and zombie randomised controlled trials submitted to Anaesthesia. Anaesthesia 2021;76(4):472-79. doi: 10.1111/anae.15263 [published Online First: 20201011]
4. Wilkinson J, Tovey D. How should we assess trustworthiness of randomized controlled trials? J Clin Epidemiol 2025;180:111670. doi: 10.1016/j.jclinepi.2025.111670 [published Online First: 20250113]
5. Shrestha B, Bhusal S, Kandel G, et al. Rivaroxaban versus dalteparin for the treatment of cancer-associated venous thromboembolism: a systematic review and meta-analysis. Ann Med Surg (Lond) 2025;87(3):1617-27. doi: 10.1097/ms9.0000000000003008 [published Online First: 20250207]
6. Sapunar L. A bitter aftertaste: legal threats, alleged poisoning muddy the waters for a trial of a tea to treat malaria. 2020 [Available from: https://retractionwatch.com/2020/08/05/a-bitter-aftertaste-legal-threats....
7. Mohamed AA, El Borolossy R, Salah EM, et al. A comparative randomized clinical trial evaluating the efficacy and safety of tacrolimus versus hydrocortisone as a topical treatment of atopic dermatitis in children. Front Pharmacol 2023;14:1202325. doi: 10.3389/fphar.2023.1202325 [published Online First: 20230920]
8. Nazari N, Aligholipour A, Sadeghi M. Transdiagnostic treatment of emotional disorders for women with multiple sclerosis: a randomized controlled trial. BMC Womens Health 2020;20(1):245. doi: 10.1186/s12905-020-01109-z [published Online First: 20201031]
9. Chen XZ, Wang DX. Notice of Retraction. Xu LL, et al. Efficacy and Safety of Esketamine for Supplemental Analgesia During Elective Cesarean Delivery: A Randomized Clinical Trial. JAMA Netw Open. 2023;6(4):e239321. JAMA Netw Open 2025;8(3):e256386. doi: 10.1001/jamanetworkopen.2025.6386 [published Online First: 20250303]
10. Seife C. Research misconduct identified by the US Food and Drug Administration: out of sight, out of mind, out of the peer-reviewed literature. JAMA Intern Med 2015;175(4):567-77. doi: 10.1001/jamainternmed.2014.7774
AUTHORS
Florian Naudet (ORCID: 0000-0003-3760-3801),1,2 Tala Jajieh (ORCID: 0009-0009-7459-5426),3 Silvy Laporte (ORCID: 0000-0001-6197-8668),3 Cedric Lemarchand (ORCID: 0009-0009-8695-1248)1
1: University Rennes, CHU Rennes, Inserm, EHESP, Irset (Institut de recherche en santé, environnement et travail) - UMR_S 1085, Centre d'investigation clinique de Rennes (CIC1414), Rennes, France
2: Institut Universitaire de France (IUF), Paris, France
3: Université Jean Monnet, Mines Saint-Étienne, INSERM Unité 1059, Santé Ingéniérie Biologie Saint-Étienne (SAINBIOSE), Saint-Étienne, France
Competing interests: FN received funding from the French National Research Agency and is a PI for the RestoRes project (research integrity in biomedical research), the French ministry of health and the French ministry of research. He is a work package leader in the OSIRIS project (Open Science to Increase Reproducibility in Science). FN is also work package leader for the doctoral network MSCA-DN SHARE-CTD (HORIZON-MSCA-2022-DN-01 101120360), funded by the EU. SL is a work package leader in the RestoRes project. She is also work package leader for the Morpheus project (EUROPE Horizon-HLTH-2022-TOOL-11-01 on Prognosis improvement of unprovoked venous thrombo-embolism using personalized anticoagulant therapy) funded by the EU. She received fees from Pfizer for a symposium lecture and fees from Ferring for a rereading of a meta-analysis. TJ and CL are PhD students in the RestoRes project.
Dear Editor,
The Journal rightly emphasized, with an editorial and an opinion paper, the importance of the investigation showing that trials can impact systematic reviews despite their retraction.(1-3) Certainly, it is hardly acceptable that grossly flawed clinical research can endlessly influence clinical practice. However, some comments are warranted.
Firstly, the call for an automated alert system to warn authors and journals when a systematic review or a guideline cites retracted research, with the expectation that the sponsor will “be responsible for the correction”, seems naïve.(2) We are afraid that artificial intelligence will be no match for human irresponsibility. The system cannot be fixed while none of the key players -- authors and sponsors, reviewers, editors, or publishers -- is accountable.
Secondly, the warning by Liu and colleagues that clinicians “must stop relying on P values because statistically significant results can be misleading”(3) is a critical point: the issue is broader than the use of retracted studies and includes our reliance on P values for the trustworthiness of evidence. This would require a major mindset shift. They also rightly noted that the problem extends beyond trials, highlighting “problematic non-randomised intervention studies and observational studies”.(3) Accordingly, here is a first step: banning the use of P values in observational studies unless the analysis has been pre-registered (e.g. on the Open Science Framework).(4,5) This must include post hoc analyses of trials and subgroup analyses, which are particularly prone to abuse. The practice of p-hacking is too widespread to ignore.(4) Data torture is obvious in the case of large cohorts that have generated hundreds of publications by cherry-picking variables and shifting categorisations.
Lastly, we must acknowledge that researchers will always be somewhat too complacent or too eager to see themselves as discoverers.(6)
References
1. Xu C, Fan S, Tian Y, et al. Investigating the impact of trial retractions on the healthcare evidence ecosystem (VITALITY Study I): retrospective cohort study. BMJ 2025;389:e082068. doi: 10.1136/bmj-2024-082068.
2. Candal-Pedreira C, Ruano-Ravina A. Retracted studies in systematic reviews and clinical guidelines. BMJ. 2025;389:r724. doi:10.1136/bmj.r724
3. Liu F, Xu C, Doi SA, Chu H, Liu H. Problematic trials are contaminating the evidence ecosystem. BMJ. 2025;389:r809. doi:10.1136/bmj.r809
4. Braillon A, Naudet F. STROBE and pre-registration of observational studies. BMJ. 2023;380:90. doi:10.1136/bmj.p90
5. Naudet F, Patel CJ, DeVito NJ, et al. Improving the transparency and reliability of observational studies through registration. BMJ. 2024;384:e076123. doi:10.1136/bmj-2023-076123
6. Kucharski A. The Uncertain Science of Certainty. Profile Books. 2025. ISBN 9781788169080
Competing interests: No competing interests
Dear editor,
Xu and colleagues report on the impact of retracted randomised clinical trials (RCTs) on the healthcare evidence ecosystem. They investigated 1330 retracted RCTs and 847 systematic reviews that quantitatively synthesised these retracted trials (1). They concluded that retracted trials have a substantial impact on evidence synthesis, clinical practice guidelines, and evidence based clinical practice. While a 16% change in statistical significance might be expected when the total number of trials is reduced by removing retracted papers, the 8.4% reviews with a change in direction of treatment effect and the 15.7% with a >50% change in the magnitude of the effect are of much more concern.
While I congratulate the authors on their effort, I am afraid I have to disagree with them when they state that “the retraction of RCTs poses a severe threat to evidence based medicine”. The problem is not so much the retracted papers as the fatally flawed and fabricated RCTs that remain unmarked. Recent estimates indicate that 25-40% of RCTs are not trustworthy enough to contribute to evidence synthesis (2-5). The percentage of problematic RCTs is higher than the 5-10% untrustworthy among all scientific papers since a) in real life RCTs are difficult to perform -- an argument that does not count for fabricators -- b) RCTs are more attractive to fabricate as they are easier to publish, and c) the characteristics of and regulations around RCTs allow better detection of fraud, for example through balanced baseline characteristics and trial registration (6).
With my team I have flagged around 1,000 problematic papers in my field, women’s health. We demonstrated that retractions occur only very slowly, if they happen at all (7). From the list of 1,330 retractions used by Xu et al., I alone (acknowledging the help of many students) have flagged more than 100. The big elephant in the room is that retracted RCTs are only the small tip of the iceberg. As I am not aware of any journal or society taking action itself, the system of removing problematic papers depends on individual whistleblowers. The COPE system, which relies heavily on publishers and local universities - both strongly conflicted stakeholders - is completely ineffective. This is the experience of everyone who tries to get papers retracted, for example the Auckland group of Grey, Bolland and Avenell, to name another force behind the 1,330 retracted papers (8). Medicine is the Olympic Games without any doping checks.
Apart from the fabricators, and the limited capacity to detect their work either pre- or post-publication, the problem of research fraud is the unwillingness of clinical and academic communities to speak out and act against the fabricators and their products.
Congratulations to the 16 authors and the VITALITY research network, with another 68 members, on publishing in The BMJ. I am curious to learn how many of the 84 contributors have flagged problematic papers to journals (i) at least once or (ii) more than 5 times. I am also curious to hear their proposed response to the large number of fatally flawed and even fabricated RCTs that remain unmarked.
References
1. Xu C, Fan S, Tian Y, Liu L, Furuya-Kanamori L, Clark J, Zhang C, Li S, Lin L, Chu H, Li S, Golder S, Loke Y, Vohra S, Glasziou P, Doi SA, Liu H. Investigating the impact of trial retractions on the healthcare evidence ecosystem (VITALITY Study I): retrospective cohort study. BMJ 2025; 389:e082068
2. Weeks J, Cuthbert A, Alfirevic Z. Trustworthiness assessment as an inclusion criterion for systematic reviews—what is the impact on results? Cochrane Ev Synth 2023;1:e12037.
3. Weibel S, Popp M, Reis S, Skoetz N, Garner P, Sydenham E. Identifying and managing problematic trials: a research integrity assessment tool for randomized controlled trials in evidence synthesis. Res Synth Methods 2023;14:357–369.
4. Mousa A, Flanagan M, Tay CT, Norman RJ, Costello M, Li W, Wang R, Teede H, Mol BW. Research Integrity in Guidelines and evIDence synthesis (RIGID): a framework for assessing research integrity in guideline development and evidence synthesis. EClinicalMedicine 2024;74:102717.
5. Carlisle JB. False individual patient data and zombie randomised controlled trials submitted to Anaesthesia. Anaesthesia. 2021;76(4):472-9.
6. Carlisle JB. Data fabrication and other reasons for non-random sampling in 5087 randomised, controlled trials in anaesthetic and general medical journals. Anaesthesia. 2017;72:944-952.
7. Shivantha S, Au NLS, Gurrin L, Thornton J, Nielsen J, Mol BW. Publishers' Response to Post-Publication Concerns About Clinical Research in Women's Health. BJOG 2025 Feb 26. doi: 10.1111/1471-0528.18100.
8. Bolland MJ, Grey A, Avenell A, Klein AA. Correcting the scientific record - A broken system? Account Res. 2021 Jul;28(5):265-279.
Competing interests: I report consultancy, travel support and research funding from Merck and consultancy for Ferring, Organon and Norgine.
Dear Editor,
We read with great interest the VITALITY Study I by Xu et al.[1] The authors provide a valuable large-scale empirical investigation quantifying the downstream cascade of retracted randomized controlled trials (RCTs) through systematic reviews (SRs) and into clinical practice guidelines. While acknowledging its significance, we wish to offer perspectives on the interpretation and broader implications of these findings.
First, evaluating evidence contamination can go beyond simply counting the number of affected SRs or guidelines, as not all outcomes within these documents are of equal clinical importance. The impact of retracted trials on critical outcomes, such as all-cause mortality or major morbidity, should arguably weigh more heavily in contamination assessments than effects on secondary or surrogate endpoints. We suggest that future evaluations of evidence contamination consider assigning importance-based weights to outcomes, allowing quantification of contamination not just by frequency, but by its potential to affect key clinical decisions.
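The importance-weighting idea above could be operationalized in many ways. As a purely illustrative sketch (the outcome names, importance ratings, and counts below are all invented for this example), one simple scheme weights each affected analysis by a GRADE-style outcome-importance rating rather than counting affected reviews equally:

```python
# GRADE-style importance ratings on a 1-9 scale; values here are invented.
importance = {"all_cause_mortality": 9, "major_bleeding": 7, "surrogate_marker": 3}

# Hypothetical counts of meta-analyses per outcome whose conclusions change
# materially once retracted trials are excluded.
affected = {"all_cause_mortality": 2, "major_bleeding": 1, "surrogate_marker": 10}

def contamination_score(affected, importance):
    """Importance-weighted contamination score in [0, 1]: 1.0 would mean every
    affected analysis concerned a maximally important outcome."""
    total = sum(affected.values())
    if total == 0:
        return 0.0
    weighted = sum(n * importance[outcome] for outcome, n in affected.items())
    return weighted / (total * max(importance.values()))

score = contamination_score(affected, importance)
# Although 10 of 13 affected analyses concern a surrogate outcome, the score
# stays well below 1 because those analyses carry low clinical importance.
```

Any such score would of course need validation against panel judgments before use; the point is only that frequency counts and importance weights can diverge substantially.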
Second, we advise caution in interpreting the observed propagation that one retracted trial reaches an average of 3 SRs and potentially 9 guidelines as a direct amplification of negative impact. This “contamination chain” is moderated by two key factors. At the SR level, although 1330 retracted trials were included in 847 SRs (4095 meta-analyses), the exclusion of these trials led to a change in effect direction or statistical significance in about 20.6% of cases. Thus, widespread inclusion of retracted trials does not equate to widespread distortion of synthesized evidence. Furthermore, at the guideline level, citation of a contaminated SR does not invariably mean the guideline's recommendations are distorted, given that guideline panels weigh evidence certainty, balance outcomes, as well as consider clinical context and patient values.[2] In many cases, a single flawed or low-certainty study may not meaningfully influence recommendations.
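The exclusion sensitivity analysis discussed above can be sketched in a few lines. The numbers below are entirely hypothetical, and a fixed-effect inverse-variance model is used only for brevity; the example shows how removing one (hypothetically retracted) trial can flip statistical significance without necessarily distorting every synthesis it touched:

```python
import math

def pool_fixed_effect(trials):
    """Inverse-variance fixed-effect pooling of log odds ratios.

    trials: list of (log_odds_ratio, standard_error) tuples.
    Returns (pooled_estimate, 95% CI, statistically_significant).
    """
    weights = [1 / se ** 2 for _, se in trials]
    pooled = sum(w * y for (y, _), w in zip(trials, weights)) / sum(weights)
    se_pooled = math.sqrt(1 / sum(weights))
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    significant = ci[0] > 0 or ci[1] < 0  # 95% CI excludes the null (log OR = 0)
    return pooled, ci, significant

# Hypothetical meta-analysis of three trials; suppose trial 0 is later retracted.
trials = [(0.9, 0.2), (0.1, 0.3), (-0.05, 0.25)]
with_retracted = pool_fixed_effect(trials)
without_retracted = pool_fixed_effect(trials[1:])
# In this constructed example the pooled effect loses statistical significance
# once trial 0 is excluded, while the direction of effect is unchanged.
```

In practice such leave-out analyses use the review's own model (often random effects), but the mechanics are the same: re-pool, then compare direction, magnitude, and significance.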
Third, although the study quantifies contamination within the evidence synthesis process, the downstream impact on clinical practice remains challenging to assess, owing to the well-documented evidence-to-practice gap.[3] Translating research findings through SRs and guidelines into tangible clinical behavior change (implementation or de-implementation of interventions) is complex and influenced by numerous factors beyond the evidence itself. While identifying contamination in guidelines is a critical first step, directly extrapolating this to widespread adverse clinical practice impact requires caution.
Finally, beyond the potential impact on guideline recommendations, the study highlights a potentially more insidious long-term impact on the research ecosystem itself (pathways 3 and 4 in Figure 3). Misleading findings can misdirect future research efforts and funding, perpetuating flawed lines of inquiry, potentially even creating a feedback loop if guidelines influenced by such research then shape future studies. This underscores the critical need for systemic solutions. Strengthening the integrity of primary research (e.g., trial registration, rigorous peer review, data sharing mandates), enhancing SR methodology (e.g., explicitly requiring and reporting checks for retracted publications, at least reducing the inclusion of already retracted papers during SR development), as well as developing rapid, effective mechanisms to flag retracted papers and alert authors of citing SRs/guidelines and the wider research community are paramount. This requires collaboration between journals, databases, researchers, guideline developers, and healthcare practitioners. Emerging technologies such as artificial intelligence[4] and the framework of living systematic reviews[5] may offer promising avenues to address the timeliness issue.
In summary, Xu et al.’s work makes an important contribution by quantifying the propagation of retracted trials and prompting further discussion. However, interpreting the true impact necessitates careful consideration of the complexities within evidence synthesis, guideline development, implementation science, and the research cycle itself. Continued systemic efforts are essential to enhance the integrity and resilience of our evidence ecosystem, and this study provides valuable data to guide such reforms.
Sincerely,
Na He
Department of Pharmacy, Peking University Third Hospital, Beijing, 100191, China
Drug Evaluation Center, Peking University Health Science Center, Beijing 100191, China
Suodi Zhai
Department of Pharmacy, Peking University Third Hospital, Beijing, 100191, China
Drug Evaluation Center, Peking University Health Science Center, Beijing 100191, China
Zhiling Zhang
Department of Pharmacy, Beijing Anzhen Hospital, Capital Medical University, Beijing, 100029, China.
References:
1. Xu C, Fan S, Tian Y, et al. Investigating the impact of trial retractions on the healthcare evidence ecosystem (VITALITY Study I): retrospective cohort study. BMJ. 2025;389:e082068. doi:10.1136/bmj-2024-082068
2. Guyatt G, Agoritsas T, Brignardello-Petersen R, et al. Core GRADE 1: overview of the Core GRADE approach. BMJ. 2025;389:e081903. doi:10.1136/bmj-2024-081903
3. Morris ZS, Wooding S, Grant J. The answer is 17 years, what is the question: understanding time lags in translational research. J R Soc Med. 2011;104(12):510-520. doi:10.1258/jrsm.2011.110180
4. Luo X, Chen F, Zhu D, et al. Potential Roles of Large Language Models in the Production of Systematic Reviews and Meta-Analyses. J Med Internet Res. 2024;26:e56780. doi:10.2196/56780
5. Elliott JH, Synnot A, Turner T, et al. Living systematic review: 1. Introduction-the why, what, when, and how. J Clin Epidemiol. 2017;91:23-30. doi:10.1016/j.jclinepi.2017.08.010
Competing interests: No competing interests
Dear Editor
The VITALITY study underscores a crisis quietly festering in our evidence ecosystem: the hidden weight of retracted trials embedded in meta-analyses and clinical guidelines [1]. This is not just a matter of methodological hygiene; it’s a philosophical fault line.
The age of asking “Is the evidence trustworthy?” may be ending.
We must now ask: “Is this where we want to invest our time, money, and moral weight?” [2]
AI systems are already matching or outperforming humans in diagnostic accuracy [3]. Real-world evidence is challenging the sanitized assumptions of randomized trials. And yet, much of contemporary medicine still rests on meta-analyses built on data later deemed invalid [1,4].
But high diagnostic performance alone is not enough.
What matters is not just what the AI or clinician predicts, but how that prediction is generated and contextualized. Without abductive reasoning—hypothesis generation grounded in meaning, context, and internal coherence—we risk optimizing for correctness without understanding. This is how “meta-analytic precision” may become detached from clinical wisdom.
Rather than defending the pyramid, we should be redesigning the terrain. If the foundation is crumbling under retracted trials, shouldn’t we ask what kind of evidence deserves to be called “meaningful”? Shouldn’t evidence-based medicine now include a commitment to retraction-aware, context-sensitive, and ethically grounded reasoning [5]?
Before we ask whether the data is trustworthy, we must ask whether it still makes sense in the lived reality of care.
Yours sincerely,
Kenjiro Shiraishi
Tanashi Kitaguchi Acupuncture Clinic
Tokyo, Japan
ORCID: 0009-0003-2550-7385
References
1. Xu C, Fan S, Tian Y, et al. Investigating the impact of trial retractions on the healthcare evidence ecosystem (VITALITY Study I): retrospective cohort study. BMJ 2025;389:e082068. doi:10.1136/bmj-2024-082068
2. Greenhalgh T, Howick J, Maskrey N. Evidence based medicine: a movement in crisis? BMJ. 2014;348:g3725.
3. Rao A, Kim J, Kamineni M, et al. Evaluating GPT as a Radiologic Decision Support: A Pilot Comparison of GPT-4 and GPT-3.5 in Breast Imaging. J Am Coll Radiol. 2023 Oct.
4. Ioannidis JPA. Why most published research findings are false. PLoS Med. 2005;2(8):e124.
5. Mol A. The Logic of Care: Health and the Problem of Patient Choice. Routledge; 2008.
Competing interests: No competing interests
Blockchain mechanism needed to avoid ill effects of retracted clinical trials on evidence based drug development, clinical guidelines and human health
Dear Editor
The last decade has witnessed a concerning 10-fold increase in journal retractions, prompting extensive research into their ramifications for the credibility of various scientific fields and their implications for patient health (1, 2, 3). Clinical trial retractions pose a significant risk to drug development, potentially leading to flawed research, inaccurate conclusions, and wasted resources. When trials are retracted due to errors, fraud, or other issues, the entire process of drug discovery and approval can be undermined. The impact of retractions extends beyond the immediate study, as retracted findings can influence subsequent research, guidelines, and even healthcare decisions.
The evidence pyramid, or hierarchy of evidence, is a framework used in evidence based practice to rank the strength and reliability of research findings. It places the highest quality evidence at the top and less reliable study designs at the bottom: systematic reviews and meta-analyses sit at the apex, followed by randomized controlled trials (RCTs) and then observational studies. The double blinded randomized controlled trial is the most reliable study design and provides the strongest support for a cause and effect relationship. Yet when a clinical trial is retracted, the mechanism for reporting its retraction is not foolproof. As a result, retracted clinical trials continue to be cited, contaminating the authentic evidence base on cause and effect relationships. A recent systematic review identified 587 articles (525 systematic reviews and 62 clinical practice guidelines) citing retracted RCTs; 252 (43%) were published after retraction and 335 (57%) before it. Of the 127 articles that cited already retracted RCTs in their evidence synthesis without caution, none issued a correction after publication (4).
At present, few integrated systems alert researchers that they have cited work subject to expressions of concern or retraction. However, the Zotero, EndNote, LibKey, and Papers bibliographic systems can automatically alert researchers to entries in Retraction Watch's database of expressions of concern and retractions (5).
The most important effect of clinical trial retraction is on clinical guidelines and the pathway of drug development. Until corrections are made, patients are at risk from the unrecognized and/or uncorrected effects of retracted publications in influential clinical guidance documents. Unless correcting systems are put in place, policymakers who use affected clinical guidelines could unintentionally put patients at risk of harm (6).
Top medical journals can avoid citation of retracted publications by ensuring a few steps. At manuscript submission, authors need to check that they have not cited retracted work and confirm this during the submission process (7). The International Committee of Medical Journal Editors has long provided guidance on this (8), and journals should require confirmation at the submission stage. Confirmation that retracted papers have been excluded from systematic reviews should be stated in the publication and could be added to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) checklist for reporting of systematic reviews (9). Authors should also confirm at submission of systematic reviews and guidelines that they will publish a note updating their publication if references cited in their evidence syntheses are retracted. Publishers also need to improve quality control by checking submitted and published articles for citations to retracted work, using software packages coupled with Retraction Watch's database of retractions.
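As an illustration of the kind of automated screening described above, the sketch below checks a manuscript's cited DOIs against a locally held export of a retraction database. This is a minimal sketch under stated assumptions: the CSV file, its `doi` column, and the function names are hypothetical for illustration, not part of any real journal or Retraction Watch workflow.

```python
# Hypothetical sketch: flag cited DOIs that appear in a local export of a
# retraction database (e.g. a CSV download with a 'doi' column). File and
# column names are assumptions for illustration only.
import csv


def load_retracted_dois(path):
    """Read a CSV export with a 'doi' column into a lowercase lookup set."""
    with open(path, newline="", encoding="utf-8") as f:
        return {row["doi"].strip().lower() for row in csv.DictReader(f)}


def flag_retracted_citations(cited_dois, retracted_dois):
    """Return the cited DOIs (in sorted order) found in the retraction set."""
    return sorted(d for d in cited_dois
                  if d.strip().lower() in retracted_dois)
```

In practice a journal's submission system could run such a check automatically at submission and again before publication, with DOI comparison done case-insensitively, since DOIs are not case sensitive.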
There is also a need for a blockchain based system that can immutably record retraction status and so minimize the citation of retracted work, for the sake of human health at large.
1. Brainard J, You J. What a massive database of retracted papers reveals about science publishing's 'death penalty'. Science. 2018;25(1):1-5.
2. Audisio K, Soletti GJ, Cancelli G, Olaria RP, Rahouma M, Gaudino M, Rong LQ. Systematic review of retracted articles in critical care medicine. Br J Anaesth. 2022;128(4):e292.
3. Gaudino M, Robinson NB, Audisio K, Rahouma M, Benedetto U, Kurlansky P, Fremes SE. Trends and characteristics of retracted articles in the biomedical literature, 1971 to 2020. JAMA Intern Med. 2021;181(8):1118-21.
4. Kataoka Y, Banno M, Tsujimoto Y, Ariie T, Taito S, Suzuki T, Oide S, Furukawa TA. Retracted randomized controlled trials were cited and not corrected in systematic reviews and clinical practice guidelines. J Clin Epidemiol. 2022;150:90-97.
5. Bolland MJ, Grey A, Avenell A. Citation of retracted publications: a challenging problem. Account Res. 2022;29:18-25.
6. Avenell A, Bolland MJ, Gamble GD, Grey A. A randomized trial alerting authors, with or without coauthors or editors, that research they cited in systematic reviews and guidelines has been retracted. Account Res. 2024;31(1):14-37.
7. Sox HC, Rennie D. Research misconduct, retraction, and cleansing the medical literature: lessons from the Poehlman case. Ann Intern Med. 2006;144(8):609-13.
8. Peh WC, Ng KH. Preparing a manuscript for submission. Singapore Med J. 2009;50(8):759-61.
9. Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, Shamseer L, Tetzlaff JM, Akl EA, Brennan SE, Chou R. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71.
Competing interests: No competing interests