-
Comparative Study
Comparison of machine-learning algorithms to build a predictive model for detecting undiagnosed diabetes - ELSA-Brasil: accuracy study.
- André Rodrigues Olivera, Valter Roesler, Cirano Iochpe, Maria Inês Schmidt, Álvaro Vigo, Sandhi Maria Barreto, and Bruce Bartholow Duncan.
- MSc. IT Analyst, Postgraduate Computing Program, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre (RS), Brazil.
- Sao Paulo Med J. 2017 May 1; 135 (3): 234246234-246.
Context And Objective:Type 2 diabetes is a chronic disease associated with a wide range of serious health complications that have a major impact on overall health. The aims here were to develop and validate predictive models for detecting undiagnosed diabetes using data from the Longitudinal Study of Adult Health (ELSA-Brasil) and to compare the performance of different machine-learning algorithms in this task.Design And Setting:Comparison of machine-learning algorithms to develop predictive models using data from ELSA-Brasil.Methods:After selecting a subset of 27 candidate variables from the literature, models were built and validated in four sequential steps: (i) parameter tuning with tenfold cross-validation, repeated three times; (ii) automatic variable selection using forward selection, a wrapper strategy with four different machine-learning algorithms and tenfold cross-validation (repeated three times), to evaluate each subset of variables; (iii) error estimation of model parameters with tenfold cross-validation, repeated ten times; and (iv) generalization testing on an independent dataset. The models were created with the following machine-learning algorithms: logistic regression, artificial neural network, naïve Bayes, K-nearest neighbor and random forest.Results:The best models were created using artificial neural networks and logistic regression. -These achieved mean areas under the curve of, respectively, 75.24% and 74.98% in the error estimation step and 74.17% and 74.41% in the generalization testing step.Conclusion:Most of the predictive models produced similar results, and demonstrated the feasibility of identifying individuals with highest probability of having undiagnosed diabetes, through easily-obtained clinical data.
Notes
Knowledge, pearl, summary or comment to share?You can also include formatting, links, images and footnotes in your notes
- Simple formatting can be added to notes, such as
*italics*
,_underline_
or**bold**
. - Superscript can be denoted by
<sup>text</sup>
and subscript<sub>text</sub>
. - Numbered or bulleted lists can be created using either numbered lines
1. 2. 3.
, hyphens-
or asterisks*
. - Links can be included with:
[my link to pubmed](http://pubmed.com)
- Images can be included with:
![alt text](https://bestmedicaljournal.com/study_graph.jpg "Image Title Text")
- For footnotes use
[^1](This is a footnote.)
inline. - Or use an inline reference
[^1]
to refer to a longer footnote elseweher in the document[^1]: This is a long footnote.
.