-
Clin. Orthop. Relat. Res. · Feb 2019
Can Machine Learning Methods Produce Accurate and Easy-to-use Prediction Models of 30-day Complications and Mortality After Knee or Hip Arthroplasty?
- Harris Alex H S AHS A. H. S. Harris, T. Bowe, N. J. Giori Center for Innovation to Implementation, VA Palo Alto Healthcare System, Palo Alto, CA, USA A. C. Kuo San Franci, Alfred C Kuo, Yingjie Weng, Amber W Trickey, Thomas Bowe, and Nicholas J Giori.
- A. H. S. Harris, T. Bowe, N. J. Giori Center for Innovation to Implementation, VA Palo Alto Healthcare System, Palo Alto, CA, USA A. C. Kuo San Francisco Veterans Affairs Medical Center, University of California, San Francisco, CA, USA A H. S. Harris, Y. Weng, A. W. Trickey Stanford-Surgical Policy Improvement Research and Education Center, Stanford, CA, USA N. J. Giori Department of Orthopedic Surgery, Stanford University School of Medicine, Stanford, CA, USA.
- Clin. Orthop. Relat. Res. 2019 Feb 1; 477 (2): 452-460.
BackgroundExisting universal and procedure-specific surgical risk prediction models of death and major complications after elective total joint arthroplasty (TJA) have limitations including poor transparency, poor to modest accuracy, and insufficient validation to establish performance across diverse settings. Thus, the need remains for accurate and validated prediction models for use in preoperative management, informed consent, shared decision-making, and risk adjustment for reimbursement.Questions/PurposesThe purpose of this study was to use machine learning methods and large national databases to develop and validate (both internally and externally) parsimonious risk-prediction models for mortality and complications after TJA.MethodsPreoperative demographic and clinical variables from all 107,792 nonemergent primary THAs and TKAs in the 2013 to 2014 American College of Surgeons-National Surgical Quality Improvement Program (ACS-NSQIP) were evaluated as predictors of 30-day death and major complications. The NSQIP database was chosen for its high-quality data on important outcomes and rich characterization of preoperative demographic and clinical predictors for demographically and geographically diverse patients. Least absolute shrinkage and selection operator (LASSO) regression, a type of machine learning that optimizes accuracy and parsimony, was used for model development. Tenfold validation was used to produce C-statistics, a measure of how well models discriminate patients who experience an outcome from those who do not. External validation, which evaluates the generalizability of the models to new data sources and patient groups, was accomplished using data from the Veterans Affairs Surgical Quality Improvement Program (VASQIP). Models previously developed from VASQIP data were also externally validated using NSQIP data to examine the generalizability of their performance with a different group of patients outside the VASQIP context.ResultsThe models, developed using LASSO regression with diverse clinical (for example, American Society of Anesthesiologists classification, comorbidities) and demographic (for example, age, gender) inputs, had good accuracy in terms of discriminating the likelihood a patient would experience, within 30 days of arthroplasty, a renal complication (C-statistic, 0.78; 95% confidence interval [CI], 0.76-0.80), death (0.73; 95% CI, 0.70-0.76), or a cardiac complication (0.73; 95% CI, 0.71-0.75) from one who would not. By contrast, the models demonstrated poor accuracy for venous thromboembolism (C-statistic, 0.61; 95% CI, 0.60-0.62) and any complication (C-statistic, 0.64; 95% CI, 0.63-0.65). External validation of the NSQIP- derived models using VASQIP data found them to be robust in terms of predictions about mortality and cardiac complications, but not for predicting renal complications. Models previously developed with VASQIP data had poor accuracy when externally validated with NSQIP data, suggesting they should not be used outside the context of the Veterans Health Administration.ConclusionsModerately accurate predictive models of 30-day mortality and cardiac complications after elective primary TJA were developed as well as internally and externally validated. To our knowledge, these are the most accurate and rigorously validated TJA-specific prediction models currently available (http://med.stanford.edu/s-spire/Resources/clinical-tools-.html). Methods to improve these models, including the addition of nonstandard inputs such as natural language processing of preoperative clinical progress notes or radiographs, should be pursued as should the development and validation of models to predict longer term improvements in pain and function.Level Of EvidenceLevel III, diagnostic study.
Notes
Knowledge, pearl, summary or comment to share?You can also include formatting, links, images and footnotes in your notes
- Simple formatting can be added to notes, such as
*italics*
,_underline_
or**bold**
. - Superscript can be denoted by
<sup>text</sup>
and subscript<sub>text</sub>
. - Numbered or bulleted lists can be created using either numbered lines
1. 2. 3.
, hyphens-
or asterisks*
. - Links can be included with:
[my link to pubmed](http://pubmed.com)
- Images can be included with:
![alt text](https://bestmedicaljournal.com/study_graph.jpg "Image Title Text")
- For footnotes use
[^1](This is a footnote.)
inline. - Or use an inline reference
[^1]
to refer to a longer footnote elseweher in the document[^1]: This is a long footnote.
.