LCE: A link-based cluster ensemble method for improved gene expression data analysis

Natthakan Iam-On*, Tossapon Boongoen, Simon Garrett

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

102 Citations (SciVal)
16 Downloads (Pure)

Abstract

Motivation: It is far from trivial to select the most effective clustering method and its parameterization, for a particular set of gene expression data, because there are a very large number of possibilities. Although many researchers still prefer to use hierarchical clustering in one form or another, this is often sub-optimal. Cluster ensemble research solves this problem by automatically combining multiple data partitions from different clusterings to improve both the robustness and quality of the clustering result. However, many existing ensemble techniques use an association matrix to summarize sample-cluster co-occurrence statistics, and relations within an ensemble are encapsulated only at coarse level, while those existing among clusters are completely neglected. Discovering these missing associations may greatly extend the capability of the ensemble methodology for microarray data clustering.

Results: The link-based cluster ensemble (LCE) method, presented here, implements these ideas and demonstrates outstanding performance. Experiment results on real gene expression and synthetic datasets indicate that LCE: (i) usually outperforms the existing cluster ensemble algorithms in individual tests and, overall, is clearly class-leading; (ii) generates excellent, robust performance across different types of data, especially with the presence of noise and imbalanced data clusters; (iii) provides a high-level data matrix that is applicable to many numerical clustering techniques; and (iv) is computationally efficient for large datasets and gene clustering.

Availability: Online supplementary and implementation are available at: http://users.aber.ac.uk/nii07/bioinformatics2010.

Contact: nii07@aber.ac.uk; natthakan@mfu.ac.th.

Supplementary information: Supplementary data are available at Bioinformatics online.
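
The abstract refers to an association matrix that summarizes sample-cluster co-occurrence across an ensemble. As a rough illustration only, the sketch below builds the plain binary form of such a matrix from a toy ensemble of two partitions. The function name `binary_association_matrix` and the toy labels are hypothetical, chosen for this example. The sketch does not attempt the link-based refinement that LCE proposes and is not the authors' implementation.

```python
from typing import List, Sequence, Tuple


def binary_association_matrix(
    partitions: Sequence[Sequence[int]],
) -> Tuple[List[Tuple[int, int]], List[List[int]]]:
    """Build a sample-by-cluster 0/1 matrix from several base clusterings.

    Each partition is a list of cluster labels, one per sample.
    Returns the column keys (partition index, cluster label) and the matrix.
    Hypothetical helper for illustration; not taken from the LCE code.
    """
    n_samples = len(partitions[0])
    columns: List[Tuple[int, int]] = []
    for p_idx, labels in enumerate(partitions):
        if len(labels) != n_samples:
            raise ValueError("all partitions must label the same samples")
        # Every distinct cluster in every partition gets its own column.
        for label in sorted(set(labels)):
            columns.append((p_idx, label))

    # Entry is 1 when the sample was assigned to that cluster, else 0.
    matrix = [
        [1 if partitions[p_idx][i] == label else 0 for p_idx, label in columns]
        for i in range(n_samples)
    ]
    return columns, matrix


# Toy ensemble (assumed data): two clusterings of four samples.
toy_partitions = [[0, 0, 1, 1], [0, 1, 1, 1]]
columns, matrix = binary_association_matrix(toy_partitions)
for row in matrix:
    print(row)
```

Each row contains exactly one 1 per base clustering, and every other entry is 0. Those zeros are where, in the abstract's terms, the relations among clusters are neglected.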

Original language: English
Article number: btq226
Pages (from-to): 1513-1519
Number of pages: 7
Journal: Bioinformatics
Volume: 26
Issue number: 12
Early online date: 05 May 2010
Digital Object Identifiers (DOIs):
Status: Published - 15 Jun 2010

