TY - JOUR
T1 - Ant colony optimization as a feature selection method in the QSAR modeling of anti-HIV-1 activities of 3-(3,5-dimethylbenzyl)uracil derivatives using MLR, PLS and SVM regression
AU - Goodarzi, Mohammad
AU - Freitas, Matheus P.
AU - Jensen, Richard
N1 - M. Goodarzi, M.P. Freitas, and R. Jensen. Ant colony optimization as a feature selection method in the QSAR modeling of anti-HIV-1 activities of 3-(3,5-dimethylbenzyl)uracil derivatives using MLR, PLS and SVM regression. Chemometrics and Intelligent Laboratory Systems, vol. 98, no. 2, pp. 123-129, 2009.
PY - 2009/10/15
Y1 - 2009/10/15
N2 - A quantitative structure–activity relationship (QSAR) modeling was carried out for the anti-HIV-1 activities of 3-(3,5-dimethylbenzyl)uracil derivatives. The ant colony optimization (ACO) strategy was used as a feature selection (descriptor selection) and model development method. Modeling of the relationship between selected molecular descriptors and pEC50 data was achieved by linear (multiple linear regression—MLR, and partial least squares regression—PLS) and nonlinear (support-vector machine regression; SVMR) methods. The QSAR models were validated by cross-validation, as well as through the prediction of activities of an external set of compounds. Both linear and nonlinear methods were found to be better than a PLS-based method using forward stepwise selection, resulting in accurate predictions, especially for the SVM regression. The squared correlation coefficients of experimental versus predicted activities for the test set obtained by MLR, PLS and SVMR models using ACO feature selection were 0.942, 0.945 and 0.991, respectively.
AB - A quantitative structure–activity relationship (QSAR) modeling was carried out for the anti-HIV-1 activities of 3-(3,5-dimethylbenzyl)uracil derivatives. The ant colony optimization (ACO) strategy was used as a feature selection (descriptor selection) and model development method. Modeling of the relationship between selected molecular descriptors and pEC50 data was achieved by linear (multiple linear regression—MLR, and partial least squares regression—PLS) and nonlinear (support-vector machine regression; SVMR) methods. The QSAR models were validated by cross-validation, as well as through the prediction of activities of an external set of compounds. Both linear and nonlinear methods were found to be better than a PLS-based method using forward stepwise selection, resulting in accurate predictions, especially for the SVM regression. The squared correlation coefficients of experimental versus predicted activities for the test set obtained by MLR, PLS and SVMR models using ACO feature selection were 0.942, 0.945 and 0.991, respectively.
KW - QSAR
KW - Anti-HIV-1 activities
KW - 3-(3,5-Dimethylbenzyl)uracil derivatives
KW - Ant colony optimization
KW - Linear and nonlinear regression methods
U2 - 10.1016/j.chemolab.2009.05.005
DO - 10.1016/j.chemolab.2009.05.005
M3 - Article
SN - 0169-7439
VL - 98
SP - 123
EP - 129
JO - Chemometrics and Intelligent Laboratory Systems
JF - Chemometrics and Intelligent Laboratory Systems
IS - 2
ER -