The Global Landscape of Artificial Intelligence Randomized Clinical Trials

A bibliometric analysis and systematic quality assessment using the exploratory Critical Appraisal Method for Evaluating Outcomes in AI (CAMEO-AI)

CAMEO-AI · PubMed + Embase + Scopus, Jan 2020–Aug 2025 · N = 2,826 randomized trials

Zahid Durrani¹ · Zubia Suhail² · Asima Faisal³ · Laeeq Malik¹ · Munazza Tayyab⁴ · Kheem Dharmani¹ · Zaryan Hasan⁷ · Sahar Fatima⁵ · Izhar Hasan³·⁶ ★ presenting author

1 MD ACCES, Karachi, Pakistan
2 Baqai Institute of Diabetology & Endocrinology, Karachi, Pakistan
3 Dow University of Health Sciences, Karachi, Pakistan
4 Rahbar Medical College, Lahore, Pakistan
5 Ysbyty Gwynedd Hospital, Betsi Cadwaladr University Health Board, Wales, UK
6 Hackensack Meridian School of Medicine, Nutley, NJ, USA
7 MD ACCES, Princeton, NJ, USA

Corpus

2,826 RCTs

AI / machine-learning randomized clinical trials, Jan 2020–Aug 2025 — up from 266 in 2020 to 660 in the first 8 months of 2025.

Risk of bias

68%

of trials show high risk of bias or significant concerns on Cochrane RoB 2.

Exploratory tool

190-item

CAMEO-AI framework, piloted descriptively on a stratified subsample of 200 trials.

What this page is

The full data behind the poster

This companion site expands every figure summarized on the printed EBM Live poster: full specialty, geographic, and journal breakdowns; the complete risk-of-bias and AI-reporting picture; the six CAMEO-AI domains; and a structured comparison against RoB 2, CONSORT-AI, and SPIRIT-AI.

Study design

Mixed-methods bibliometric analysis + systematic quality assessment. Multi-database search of PubMed, Embase & Scopus for English-language RCTs of ML/AI clinical interventions, Jan 2020–Aug 2025. After de-duplication and QC: N = 2,826 trials.

Quality instruments applied

Cochrane RoB 2 and the CONSORT-AI extension across the full corpus; the SPIRIT-AI extension among trials with protocol materials available; CAMEO-AI piloted descriptively on a stratified n=200 subsample.
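The source does not say which variables defined the strata for the n = 200 CAMEO-AI subsample. As a sketch only, the following shows one common way to draw such a subsample: proportional stratified sampling with largest-remainder rounding. The stratum field (`year`), the record layout, and the function name `stratified_sample` are hypothetical, and the toy records are synthetic rather than study data.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, n, seed=0):
    """Draw n records with per-stratum quotas proportional to stratum size."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[rec[key]].append(rec)

    total = len(records)
    # Exact proportional share per stratum, then round down.
    shares = {s: n * len(members) / total for s, members in strata.items()}
    quotas = {s: int(share) for s, share in shares.items()}

    # Give the leftover slots to the strata with the largest fractional remainders.
    leftover = n - sum(quotas.values())
    by_remainder = sorted(shares, key=lambda s: shares[s] - quotas[s], reverse=True)
    for s in by_remainder[:leftover]:
        quotas[s] += 1

    sample = []
    for s, members in strata.items():
        sample.extend(rng.sample(members, quotas[s]))
    return sample

# Synthetic records: only the corpus size (2,826) and the subsample size (200)
# come from the source; the "year" stratum is an assumption for illustration.
toy_corpus = [{"id": i, "year": 2020 + i % 6} for i in range(2826)]
subsample = stratified_sample(toy_corpus, key="year", n=200)
print(len(subsample))
```

Fixing the seed keeps the draw reproducible, which matters if the subsample is to be re-appraised later by independent raters.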

Headline finding

AI RCT volume grew 2.5× from 2020 to 2025, but core AI-specific safeguards — external validation, fairness assessment, leakage controls — are documented in well under a third of trials.
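The 2.5× figure follows from the corpus counts reported above (266 trials in 2020, 660 in the first eight months of 2025). A minimal sketch of that arithmetic is shown here. The annualized value is an illustrative assumption (linear extrapolation of the 8-month count to 12 months), not a result reported by the study.

```python
# Counts taken from the corpus summary; everything else is illustrative.
TRIALS_2020 = 266           # full calendar year
TRIALS_2025_PARTIAL = 660   # January-August 2025 only
MONTHS_OBSERVED_2025 = 8

def growth_factor(start: float, end: float) -> float:
    """Ratio of end-period volume to start-period volume."""
    return end / start

raw_growth = growth_factor(TRIALS_2020, TRIALS_2025_PARTIAL)

# Assumption: trials accrue evenly through the year, so 8 months scale linearly to 12.
annualized_2025 = TRIALS_2025_PARTIAL * 12 / MONTHS_OBSERVED_2025
annualized_growth = growth_factor(TRIALS_2020, annualized_2025)

print(f"Raw growth (8 months of 2025 vs. all of 2020): {raw_growth:.2f}x")
print(f"Annualized 2025 estimate: {annualized_2025:.0f} trials")
print(f"Annualized growth: {annualized_growth:.2f}x")
```

Because the 2025 count covers a partial year, the raw ratio understates full-year growth if publication continues at the same pace.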

CAMEO-AI framework

The Critical Appraisal Method for Evaluating Outcomes in AI

CAMEO-AI is a 190-item exploratory framework spanning six domains, piloted descriptively on a stratified subsample of 200 trials. It is proposed as a complement to, not a replacement for, RoB 2, CONSORT-AI, and SPIRIT-AI. The six domains are described below.

Whether the trial's eligibility criteria, comparator, randomization scheme, and outcome selection are appropriate for an AI-enabled intervention — including whether the comparator reflects real-world clinical workflow rather than an idealized baseline, and whether outcomes capture clinically meaningful endpoints rather than purely algorithmic performance metrics.

Provenance, representativeness, and documentation of the data used to train and evaluate the AI system — including population coverage, label quality, handling of missing or noisy inputs, and disclosure of data sources, consistent with the data-leakage and provenance gaps identified across the corpus.

Soundness of model selection, training procedure, hyperparameter tuning, and handling of class imbalance or confounding — the technical complement to RoB 2's randomization and outcome-measurement domains, adapted to algorithmic rather than purely statistical methods.

External and prospective temporal validation, calibration, and subgroup performance, directly probing practices documented in fewer than one-third of trials corpus-wide (external validation 23%, temporal validation 31%).

Code, model, and data availability; versioning; and sufficiency of reporting for independent replication — mapping onto the corpus-wide code-availability rate of just 18%.

Algorithmic fairness assessment across demographic subgroups, informed-consent handling of AI-specific risks, and regulatory clearance status — the domain most sparsely documented in the corpus, with fairness assessment present in just 12% of trials.

CAMEO-AI pilot (n = 200, exploratory)

Marked AI-specific methodological variability — even among trials meeting conventional RoB 2 / CONSORT-AI thresholds. Recurrent gaps: data provenance, external & temporal validation, transparency / reproducibility documentation, subgroup & fairness analysis, and ethical-regulatory oversight. Domain- and overall-level numeric scores are summarized descriptively in the source study and are not reproduced here, as CAMEO-AI has not yet been independently validated.

Framework comparison

CAMEO-AI vs. RoB 2, CONSORT-AI & SPIRIT-AI

A structural comparison of all four instruments, drawn from their original publications. CAMEO-AI is exploratory and not yet validated; it is shown here as a proposed complementary layer, not a substitute for the other three.

Dimension	RoB 2	CONSORT-AI	SPIRIT-AI	CAMEO-AI
Primary purpose	Risk-of-bias judgement for a trial result	Completeness of trial reporting	Completeness of trial protocol reporting	Broader methodological & translational-readiness appraisal
Trial stage assessed	Completed trial, results stage	Trial report, publication stage	Trial protocol, design / pre-registration stage	Spans protocol through post-deployment
Structure	5 bias domains + signalling-question algorithm	37 core CONSORT 2010 items + 14 AI items (11 ext. + 3 elab.)	SPIRIT 2013 items + 15 AI items (12 ext. + 3 elab.)	190 items across 6 domains
Purpose-built for AI?	No — general-purpose	Yes	Yes	Yes
Output format	Low / Some concerns / High, per domain + overall	Item-by-item completeness checklist	Item-by-item completeness checklist	Descriptive domain & overall scores
Validation status	Cochrane-endorsed standard (2019)	Consensus-developed, EQUATOR-registered (2020)	Consensus-developed, EQUATOR-registered (2020)	Exploratory — not yet independently validated
Source	Sterne JAC et al., BMJ 2019	Liu X et al., Nat Med / BMJ / Lancet Digital Health 2020	Rivera SC et al., Nat Med / BMJ / Lancet Digital Health 2020	This study (EBM Live, Submission #56)

CAMEO-AI is designed to sit alongside — not replace — the other three: RoB 2 still judges internal validity, CONSORT-AI/SPIRIT-AI still govern reporting completeness, and CAMEO-AI adds an AI-lifecycle lens (data, methodology, validation, transparency, ethics) that none of the three was built to cover.
