Spine J · Feb 2020

Development of an unsupervised machine learning algorithm for the prognostication of walking ability in spinal cord injury patients.

- Zachary DeVries, Mohamad Hoda, Carly S Rivers, Audrey Maher, Eugene Wai, Dita Moravek, Alexandra Stratton, Stephen Kingwell, Nader Fallah, Jérôme Paquet, Philippe Phan, and RHSCIR Network.
- Ottawa Spine Collaborative Analytics Network, The Ottawa Hospital, Ottawa, ON K1Y 4E9, Canada.
- Spine J. 2020 Feb 1; 20 (2): 213-224.
Background Context: Traumatic spinal cord injury can have a dramatic effect on a patient's life. The degree of neurologic recovery greatly influences a patient's treatment and expected quality of life. This has resulted in the development of machine learning algorithms (MLA) that use acute demographic and neurologic information to prognosticate recovery. The van Middendorp et al. (2011) (vM) logistic regression (LR) model has been established as a reference model for the prediction of walking recovery following spinal cord injury, as it has been validated in many different countries. However, an examination of the way in which these prediction models are evaluated is warranted. The area under the receiver operating characteristic curve (AUROC) has been consistently used when evaluating model performance, but it has been shown that AUROC overemphasizes the most common event, resulting in an inaccurate assessment when the data are imbalanced. Furthermore, there is evidence that more advanced MLA, such as an unsupervised k-means model, may show superior performance compared to LR, as they can handle a larger number of features.

Purpose: The first objective of the study was to assess the performance of both an unsupervised MLA and an LR model with complete admission neurologic information against the vM and Hicks models. The second was to compare the accuracy of the AUROC and the F1-score to determine which method is superior for assessing the diagnostic performance of prediction models on large-scale datasets.

Study Design: Retrospective review of a prospective cohort study.

Patient Sample: The Rick Hansen Spinal Cord Injury Registry (RHSCIR) was used in this study. All patients enrolled between 2004 and 2017 with complete neurologic examination and Functional Independence Measure outcome data at ≥1 year follow-up, or who could walk at discharge, were included. The prognostic variables included age (dichotomized at ≥65 years old); American Spinal Injury Association Impairment Scale (AIS) grade; and individual motor, light touch, and pinprick scores from L2 to S1.

Outcome Measures: The Functional Independence Measure locomotor score was used to assess independent walking ability at discharge or 1-year follow-up.

Methods: An unsupervised MLA with k=2 was chosen in order to identify a "walk" cluster and a "not walk" cluster. Model performance was assessed through the development of a receiver operating characteristic curve with associated AUROC and a precision-recall curve with associated F1-score. The study and the RHSCIR are supported by funding from Health Canada, Western Economic Diversification Canada, and the Governments of Alberta, British Columbia, Manitoba, and Ontario. These funders had no role in the study or study reporting, and the authors have no conflicts of interest to report.

Results: No clinically relevant differences were found with the use of an unsupervised MLA with a greater amount of initial neurologic information compared to the established standards for any AIS classification. Although demonstrated for all separate AIS classifications, most notably the AUROCs for the vM (0.78) and Hicks (0.76) models were found to be superior to that of the new LR model (0.72); however, the vM and Hicks models had more than double the number of false negative classifications compared to the LR model. The F1-scores of these three models were also found to differ, but with the vM and Hicks models being lower than the LR model (0.85, 0.81, and 0.89, respectively).

Conclusions: No clinically relevant differences were found between the use of an unsupervised MLA with complete admission neurologic information and the previously validated standards; however, when comparing the performance of the AUROC and F1-score, the AUROC showed inaccurate prognostic performance when there was an imbalance toward a greater number of false negatives. Importantly, the F1-score did not succumb to this imbalance. As AUROC has been used as the standard when evaluating the performance of prediction models, consideration as to whether this is the most appropriate method is warranted. Future work should focus on comparing AUROC and F1-scores with other previously validated models.

Copyright © 2019 Elsevier Inc. All rights reserved.
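The contrast between AUROC and F1-score described in the abstract can be illustrated with a small sketch. Everything below is an assumption made for illustration: the labels and scores are synthetic (not RHSCIR data), the 0.5 decision threshold is arbitrary, and the helper names `f1_score` and `auroc` are hypothetical, not the authors' code. The sketch only shows how a model can rank patients well (high AUROC) while still producing many false negatives at its operating threshold, which the F1-score penalizes.

```python
# Toy illustration with synthetic data; not the study's data or method.

def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN), computed from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def auroc(y_true, scores):
    """AUROC as the probability that a random positive outscores a
    random negative (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def classify(scores, threshold=0.5):
    """Turn scores into 0/1 predictions at an assumed threshold."""
    return [1 if s >= threshold else 0 for s in scores]

# Imbalanced synthetic cohort: 8 patients who walk (1), 2 who do not (0).
y_true = [1] * 8 + [0] * 2

# Model A ranks every walker above every non-walker, but half of the
# walkers score below the threshold and become false negatives.
scores_a = [0.9, 0.8, 0.7, 0.6, 0.45, 0.4, 0.35, 0.3, 0.2, 0.1]

# Model B ranks less cleanly (one non-walker scores 0.72) but produces
# no false negatives at the same threshold.
scores_b = [0.9, 0.85, 0.8, 0.75, 0.7, 0.65, 0.6, 0.55, 0.72, 0.2]

auroc_a, auroc_b = auroc(y_true, scores_a), auroc(y_true, scores_b)
f1_a = f1_score(y_true, classify(scores_a))
f1_b = f1_score(y_true, classify(scores_b))

print(f"Model A: AUROC={auroc_a:.2f}, F1={f1_a:.2f}")
print(f"Model B: AUROC={auroc_b:.2f}, F1={f1_b:.2f}")
```

In this toy cohort, model A has the higher AUROC (1.00 versus 0.75) but the lower F1-score (about 0.67 versus about 0.94), because AUROC summarizes ranking across all thresholds while the F1-score reflects the false negatives and false positives actually produced at the chosen threshold.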
