BMC Med Res Methodol · Feb 2017
Performance of Firth- and logF-type penalized methods in risk prediction for small or sparse binary data.
When risk models for binary outcomes are developed from small or sparse data sets, standard logistic regression based on maximum likelihood estimation (MLE) faces several problems, including biased or infinite estimates of the regression coefficients and frequent convergence failure of the likelihood due to separation. Separation occurs commonly even when the sample size is large if there is a sufficient number of strong predictors. In the presence of separation, even if a model can be fitted, it is overfitted and has poor predictive performance. Firth- and logF-type penalized regression methods are popular alternatives to MLE, particularly for solving the separation problem. Despite their attractive advantages, their use in risk prediction is very limited. This paper evaluated these methods in risk prediction in comparison with MLE and other commonly used penalized methods such as ridge. ⋯ The logF-type penalized method, particularly logF(1,1), could be used in practice when developing a risk model for small or sparse data sets.
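
To illustrate why a log-F penalty keeps estimates finite under separation, here is a minimal sketch in Python. It is not the paper's implementation. It assumes the commonly used form of the log-F(m,m) penalty, which adds m*b/2 - m*log(1 + exp(b)) to the log-likelihood for each penalized coefficient b and leaves the intercept unpenalized. The function names (`fit_logf`, `penalized_loglik`) and the toy data are hypothetical; the data are completely separated, so the unpenalized MLE of the slope would diverge.

```python
import math

def expit(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def log1pexp(z):
    # Numerically stable log(1 + exp(z)).
    return z + math.log1p(math.exp(-z)) if z > 0 else math.log1p(math.exp(z))

def penalized_loglik(b0, b1, x, y, m):
    # Logistic log-likelihood plus the assumed log-F(m,m) penalty on the slope only.
    ll = sum(yi * (b0 + b1 * xi) - log1pexp(b0 + b1 * xi) for xi, yi in zip(x, y))
    return ll + m * b1 / 2.0 - m * log1pexp(b1)

def fit_logf(x, y, m=1.0, max_iter=100, tol=1e-10):
    # Newton's method with step halving for one predictor plus an intercept.
    b0 = b1 = 0.0
    for _ in range(max_iter):
        p = [expit(b0 + b1 * xi) for xi in x]
        w = [pi * (1.0 - pi) for pi in p]
        q = expit(b1)
        # Gradient of the penalized log-likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p)) + m / 2.0 - m * q
        # Entries of the negative Hessian [[a, b], [b, c]]; the penalty adds m*q*(1-q) to c.
        a = sum(w)
        b = sum(wi * xi for wi, xi in zip(w, x))
        c = sum(wi * xi * xi for wi, xi in zip(w, x)) + m * q * (1.0 - q)
        det = a * c - b * b
        d0 = (c * g0 - b * g1) / det
        d1 = (a * g1 - b * g0) / det
        # Halve the step until the objective does not decrease.
        step = 1.0
        current = penalized_loglik(b0, b1, x, y, m)
        while step > 1e-8 and penalized_loglik(b0 + step * d0, b1 + step * d1, x, y, m) < current:
            step /= 2.0
        b0 += step * d0
        b1 += step * d1
        if max(abs(step * d0), abs(step * d1)) < tol:
            break
    return b0, b1

# Toy data with complete separation: y equals x for every observation.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0, 0, 0, 0, 1, 1, 1, 1]
b0, b1 = fit_logf(x, y, m=1.0)
print("intercept:", b0, "slope:", b1)
```

With m = 1 the penalty behaves like -|b|/2 for large |b|, which outweighs the vanishing likelihood gain from pushing the slope further out, so the penalized estimate stays finite even though the data are perfectly separated.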