TY - JOUR
T1 - QSRR modeling for diverse drugs using different feature selection methods coupled with linear and nonlinear regressions
AU - Goodarzi, Mohammad
AU - Jensen, Richard
AU - Vander Heyden, Yvan
PY - 2012/12/1
Y1 - 2012/12/1
AB - A Quantitative Structure-Retention Relationship (QSRR) is proposed to estimate the chromatographic retention of 83 diverse drugs on a Unisphere polybutadiene (PBD) column, using isocratic elutions at pH 11.7. Previous work generated QSRR models for these drugs using Classification And Regression Trees (CART). In this work, Ant Colony Optimization (ACO) is used as a feature selection method to find the best molecular descriptors from a large pool. Several other selection methods, such as Genetic Algorithms, Stepwise Regression and the Relief method, were also applied, both to evaluate ACO as a feature selection method and to investigate its ability to find the important descriptors in QSRR. Multiple Linear Regression (MLR) and Support Vector Machines (SVMs) were applied as linear and nonlinear regression methods, respectively, giving excellent correlation between the experimental and predicted logarithms of the retention factors of the drugs (log k_w), where the experimental values are extrapolated to a mobile phase consisting of pure water. The overall best model was the SVM model built using descriptors selected by ACO. (C) 2012 Elsevier B.V. All rights reserved.
KW - Chromatographic retention
KW - STRUCTURE-RETENTION RELATIONSHIP
KW - PREDICTION
KW - QSAR
KW - SPLINES
KW - Relief
KW - WATER PARTITION-COEFFICIENT
KW - QSRR
KW - CLASSIFICATION
KW - ACO
KW - MLR
KW - VARIABLE SELECTION
KW - SVM
UR - http://hdl.handle.net/2160/8950
U2 - 10.1016/j.jchromb.2012.01.012
DO - 10.1016/j.jchromb.2012.01.012
M3 - Article
SN - 1570-0232
VL - 910
SP - 84
EP - 94
JO - Journal of Chromatography B: Analytical Technologies in the Biomedical and Life Sciences
JF - Journal of Chromatography B: Analytical Technologies in the Biomedical and Life Sciences
ER -