Chemical Research in Toxicology
Chem. Res. Toxicol. · Jun 2011
A comprehensive support vector machine binary hERG classification model based on extensive but biased end point hERG data sets.
The human ether-a-go-go related gene (hERG) potassium ion channel plays a key role in cardiotoxicity and is therefore a key target in preclinical drug discovery toxicity screening. The PubChem hERG Bioassay data set, composed of 1668 compounds, was used to construct an in silico screening model. The corresponding trial models were constructed from a descriptor pool composed of 4D fingerprints (4D-FP) and traditional 2D and 3D VolSurf-like molecular descriptors. ⋯ The linear SVM binary classification model-building strategy was applied to different combinations of MOE (traditional 2D, "2½D", and 3D VolSurf-like) and 4D-FP molecular descriptors to further explore and refine previously proposed key descriptors, identify new significant features that contribute to the prediction of hERG toxicity, and construct the optimal SVM binary classification model from a shrunken descriptor pool. The accuracy, sensitivity, and specificity of the best model, determined from 10-fold cross-validation, are 95%, 90%, and 96%, respectively; the overall accuracy on the external set is near 87%. The models constructed in this study demonstrate the following: (i) robustness, based on accuracy across the structural diversity of the training set; (ii) the ability to predict a compound's "predisposition" to block hERG ion channels; and (iii) the ability to define and illustrate structural features that can be overlaid onto the chemical structures to aid in the 3D structure-activity interpretation of the hERG blocking effect.
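The three reported figures of merit follow from a binary confusion matrix, and 10-fold cross-validation partitions the compounds into ten held-out folds. The sketch below is illustrative only: it assumes label 1 marks a hERG blocker and 0 a non-blocker, uses contiguous unshuffled folds, and the names `binary_metrics` and `kfold_indices` are hypothetical. The abstract does not state how the authors assigned labels or built their folds, and the toy labels are not data from the study.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity for 0/1 labels.

    Assumption: 1 = blocker (positive class), 0 = non-blocker.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # blockers found
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-blockers found
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed blockers

    total = tp + tn + fp + fn
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / total if total else nan,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,  # true positive rate
        "specificity": tn / (tn + fp) if (tn + fp) else nan,  # true negative rate
    }


def kfold_indices(n_samples: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous folds, without shuffling."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")

    # The first (n_samples % k) folds receive one extra sample.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


if __name__ == "__main__":
    # Toy labels for illustration only.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    print(binary_metrics(y_true, y_pred))

    # Fold sizes for a data set of 1668 compounds split ten ways.
    print([len(test) for _, test in kfold_indices(1668, 10)])
```

In a real workflow the folds would normally be shuffled and stratified by class, since accuracy alone can look favorable when non-blockers greatly outnumber blockers; that is one reason sensitivity and specificity are reported separately.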