TY - GEN
T1 - Improved link-based cluster ensembles for microarray data analysis
AU - Iam-On, Natthakan
AU - Boongoen, Tossapon
PY - 2012
Y1 - 2012
N2 - Cancer has been identified as the leading cause of death. It is predicted that around 20-26 million people will be diagnosed with cancer by 2020. As a result, there is an urgent need for a more effective methodology to prevent and cure cancer. Microarray technology provides a useful basis for achieving this ultimate goal. For cancer research, it has become almost routine to create gene expression profiles, which can separate patients into good and poor prognosis groups. This cluster analysis offers a useful basis for individualized treatment of disease. Cluster ensembles have been shown to be better than any standard clustering algorithm for such a task. This meta-learning formalism helps users to overcome the dilemma of selecting an appropriate technique and the parameters for that technique, given a set of data. Among different state-of-the-art methods, the link-based approach (LCE) provides a highly accurate clustering. This paper presents an improvement of LCE that develops and employs a new link-based similarity measure. Additional information that is already available in an information network is included in the similarity assessment. This refinement can increase the quality of the measures and hence of the resulting cluster decision. The performance of this improved LCE is evaluated on published microarray datasets, in comparison with the original LCE and several well-known cluster ensemble techniques. The findings suggest that the new model can improve the accuracy of LCE and performs better than the others investigated in the empirical study.
AB - Cancer has been identified as the leading cause of death. It is predicted that around 20-26 million people will be diagnosed with cancer by 2020. As a result, there is an urgent need for a more effective methodology to prevent and cure cancer. Microarray technology provides a useful basis for achieving this ultimate goal. For cancer research, it has become almost routine to create gene expression profiles, which can separate patients into good and poor prognosis groups. This cluster analysis offers a useful basis for individualized treatment of disease. Cluster ensembles have been shown to be better than any standard clustering algorithm for such a task. This meta-learning formalism helps users to overcome the dilemma of selecting an appropriate technique and the parameters for that technique, given a set of data. Among different state-of-the-art methods, the link-based approach (LCE) provides a highly accurate clustering. This paper presents an improvement of LCE that develops and employs a new link-based similarity measure. Additional information that is already available in an information network is included in the similarity assessment. This refinement can increase the quality of the measures and hence of the resulting cluster decision. The performance of this improved LCE is evaluated on published microarray datasets, in comparison with the original LCE and several well-known cluster ensemble techniques. The findings suggest that the new model can improve the accuracy of LCE and performs better than the others investigated in the empirical study.
KW - cluster ensembles
KW - clustering
KW - link-based similarity
KW - microarray data
UR - http://www.scopus.com/inward/record.url?scp=84872388398&partnerID=8YFLogxK
U2 - 10.1109/ICSMC.2012.6378034
DO - 10.1109/ICSMC.2012.6378034
M3 - Conference Proceeding (Non-Journal item)
AN - SCOPUS:84872388398
SN - 9781467317139
T3 - Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
SP - 2014
EP - 2019
BT - Proceedings 2012 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2012
T2 - 2012 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2012
Y2 - 14 October 2012 through 17 October 2012
ER -