This work investigates variable selection and classification for biomedical datasets with small sample sizes and very high input dimensionality. Sequential sparse Bayesian learning methods with linear bases serve as the basic variable selection algorithm. The selected variables are fed to kernel-based probabilistic classifiers: Bayesian least squares support vector machines (LS-SVMs) and relevance vector machines (RVMs). We employ bagging for both variable selection and model building in order to improve the reliability of the selected variables and the predictive performance. This modelling strategy is applied to real-life medical classification problems, including two binary cancer diagnosis problems based on microarray data and a multiclass brain tumor classification problem using spectra acquired via magnetic resonance spectroscopy. The approach is compared experimentally with other variable selection methods. The results show that bagging can improve the reliability and stability of both variable selection and model prediction.
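The idea of bagging the variable selection step can be illustrated with a minimal sketch. It is not the paper's implementation: the per-resample selector below is a simple stand-in (ranking variables by the absolute difference between class means) rather than sequential sparse Bayesian learning, and the names `select_top_k`, `bagged_selection`, `n_bags` and `min_freq` are hypothetical. The sketch only shows the general pattern of repeating selection on bootstrap resamples and keeping the variables that are chosen often.

```python
import random
from collections import Counter


def select_top_k(X, y, k):
    """Stand-in selector (NOT sparse Bayesian learning): rank variables by
    the absolute difference between the two class means and keep the top k."""
    n_vars = len(X[0])
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    scores = []
    for j in range(n_vars):
        mean_pos = sum(row[j] for row in pos) / len(pos)
        mean_neg = sum(row[j] for row in neg) / len(neg)
        scores.append((abs(mean_pos - mean_neg), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:k]]


def bagged_selection(X, y, k=2, n_bags=50, min_freq=0.5, seed=0):
    """Run the selector on bootstrap resamples and return the variables
    selected in at least `min_freq` of the usable resamples, together with
    the selection frequency of every variable that was picked at least once."""
    rng = random.Random(seed)
    n = len(X)
    counts = Counter()
    used = 0
    for _ in range(n_bags):
        # Bootstrap: draw n sample indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # skip resamples that contain a single class only
        Xb = [X[i] for i in idx]
        counts.update(select_top_k(Xb, yb, k))
        used += 1
    freq = {j: c / used for j, c in counts.items()}
    stable = sorted(j for j, f in freq.items() if f >= min_freq)
    return stable, freq
```

Calling `bagged_selection(X, y, k=1)` on a list of sample rows `X` with binary labels `y` returns the indices of the frequently selected variables and a dictionary of selection frequencies. In a pipeline like the one described, the retained variables would then be passed to a classifier, and the frequency threshold is a tuning choice assumed here for illustration.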
|Number of pages||10|
|Journal||IEEE Transactions on Information Technology in Biomedicine|
|Publication status||Published - May 2007|