Availability and reporting quality of external validations of machine-learning prediction models with orthopedic surgical outcomes: a systematic review

Authors

  • Olivier Q Groot, Orthopedic Oncology Service, Massachusetts General Hospital, Harvard Medical School, Boston, USA; Department of Orthopedic Surgery, University Medical Center Utrecht, Utrecht University, The Netherlands
  • Bas J J Bindels, Department of Orthopedic Surgery, University Medical Center Utrecht, Utrecht University, The Netherlands
  • Paul T Ogink, Department of Orthopedic Surgery, University Medical Center Utrecht, Utrecht University, The Netherlands
  • Neal D Kapoor, Orthopedic Oncology Service, Massachusetts General Hospital, Harvard Medical School, Boston, USA
  • Peter K Twining, Orthopedic Oncology Service, Massachusetts General Hospital, Harvard Medical School, Boston, USA
  • Austin K Collins, Orthopedic Oncology Service, Massachusetts General Hospital, Harvard Medical School, Boston, USA
  • Michiel E R Bongers, Orthopedic Oncology Service, Massachusetts General Hospital, Harvard Medical School, Boston, USA
  • Amanda Lans, Orthopedic Oncology Service, Massachusetts General Hospital, Harvard Medical School, Boston, USA; Department of Orthopedic Surgery, University Medical Center Utrecht, Utrecht University, The Netherlands
  • Jacobien H F Oosterhoff, Orthopedic Oncology Service, Massachusetts General Hospital, Harvard Medical School, Boston, USA
  • Aditya V Karhade, Orthopedic Oncology Service, Massachusetts General Hospital, Harvard Medical School, Boston, USA
  • Jorrit-Jan Verlaan, Department of Orthopedic Surgery, University Medical Center Utrecht, Utrecht University, The Netherlands
  • Joseph H Schwab, Orthopedic Oncology Service, Massachusetts General Hospital, Harvard Medical School, Boston, USA

DOI:

https://doi.org/10.1080/17453674.2021.1910448

Abstract

Background and purpose — External validation of machine learning (ML) prediction models is an essential step before clinical application. We assessed the proportion, performance, and transparent reporting of externally validated ML prediction models in orthopedic surgery, using the Transparent Reporting for Individual Prognosis or Diagnosis (TRIPOD) guidelines.

Material and methods — We performed a systematic search using synonyms for every orthopedic specialty, ML, and external validation. The proportion of externally validated models was determined against 59 ML prediction models for orthopedic surgical outcomes that had only internal validation, were published up until June 18, 2020, and were previously identified by our group. Model performance was evaluated using discrimination, calibration, and decision-curve analysis. Transparent reporting was assessed using the TRIPOD guidelines.
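
As an illustration of the three performance measures named above, the following is a minimal sketch and not the code used in this review. The function names and the toy data are hypothetical. The calibration summary shown (observed event rate minus mean predicted risk) is a deliberately simplified stand-in; validation studies more commonly report a calibration slope and intercept.

```python
from typing import Sequence


def c_statistic(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Discrimination: chance that a random patient with the outcome gets a
    higher predicted risk than a random patient without it (ties count half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_calibration_gap(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Simplified calibration summary: observed event rate minus mean
    predicted risk. Zero means predictions are right on average."""
    return sum(y_true) / len(y_true) - sum(y_prob) / len(y_prob)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Decision-curve net benefit at one risk threshold pt:
    TP/n - FP/n * pt / (1 - pt)."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


if __name__ == "__main__":
    # Hypothetical external-validation cohort: outcomes and predicted risks.
    outcomes = [1, 1, 0, 0, 1, 0]
    risks = [0.9, 0.7, 0.4, 0.2, 0.3, 0.6]
    print("c-statistic:", round(c_statistic(outcomes, risks), 3))
    print("calibration gap:", round(mean_calibration_gap(outcomes, risks), 3))
    print("net benefit at 0.5:", round(net_benefit(outcomes, risks, 0.5), 3))
```

Reporting all three together matters because a model can rank patients well (high c-statistic) while still over- or underestimating absolute risk, and neither measure alone shows whether acting on the predictions improves decisions.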

Results — We included 18 studies externally validating 10 different ML prediction models of the 59 available ML models after screening 4,682 studies. All external validations identified in this review retained good discrimination. Other key performance measures were provided in only 3 studies, rendering overall performance evaluation difficult. The overall median TRIPOD completeness was 61% (IQR 43–89), with 6 items being reported in less than 4/18 of the studies.

Interpretation — Most current predictive ML models are not externally validated. The 18 available external validation studies were characterized by incomplete reporting of performance measures, limiting a transparent examination of model performance. Further prospective studies are needed to validate or refute the myriad of predictive ML models in orthopedics while adhering to existing guidelines. This ensures clinicians can take full advantage of validated and clinically implementable ML decision tools.

Published

2021-04-18

How to Cite

Groot, O. Q., Bindels, B. J. J., Ogink, P. T., Kapoor, N. D., Twining, P. K., Collins, A. K., … Schwab, J. H. (2021). Availability and reporting quality of external validations of machine-learning prediction models with orthopedic surgical outcomes: a systematic review. Acta Orthopaedica, 92(4), 385–393. https://doi.org/10.1080/17453674.2021.1910448