A link-based approach to the cluster ensemble problem

Natthakan Iam-On*, Tossapon Boongoen, Simon Garrett, Chris Price

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

193 Citations (SciVal)

Abstract

Cluster ensembles have recently emerged as a powerful alternative to standard cluster analysis, aggregating several input data clusterings to generate a single output clustering with improved robustness and stability. From the early work, these techniques held great promise; however, most of them generate the final solution from incomplete information about the cluster ensemble. The underlying ensemble-information matrix reflects only cluster-data point relations, while relations among clusters are generally overlooked. This paper presents a new link-based approach to improve the conventional matrix. It does so using the similarity between clusters, which is estimated from a link network model of the ensemble. In particular, three new link-based algorithms are proposed for the underlying similarity assessment. The final clustering result is generated from the refined matrix using two different consensus functions, based on feature-based and graph-based partitioning. This approach is the first to address and explicitly employ the relationship between input partitions, which has not been emphasized by recent studies of matrix refinement. The effectiveness of the link-based approach is empirically demonstrated over 10 data sets (synthetic and real) and three benchmark evaluation measures. The results suggest that the new approach is able to efficiently extract information embedded in the input clusterings and regularly achieves higher clustering quality than several state-of-the-art techniques.
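The matrix refinement described in the abstract can be illustrated with a minimal sketch. It is not one of the paper's three link-based algorithms, which the abstract does not specify. All function names, the Jaccard edge weights, the shared-neighbour score and the `decay` parameter are assumptions chosen for illustration. The sketch builds a binary cluster-data point matrix from the input clusterings, estimates similarity between clusters through the clusters they are both linked to, and uses that similarity to fill in the zero entries.

```python
import numpy as np


def association_matrix(labelings):
    """Stack one-hot encodings of each input clustering into a binary N x P matrix.

    Also returns, for every column (cluster), the index of the clustering it came from.
    """
    blocks, owners = [], []
    for m, labels in enumerate(labelings):
        labels = np.asarray(labels)
        ids = np.unique(labels)
        blocks.append((labels[:, None] == ids[None, :]).astype(float))
        owners.extend([m] * len(ids))
    return np.hstack(blocks), np.array(owners)


def shared_neighbour_similarity(ba, decay=0.9):
    """Hypothetical link-based similarity between clusters (an assumption).

    Clusters are nodes; edge weights are Jaccard overlaps of their member sets.
    Two clusters are similar when they are both linked to the same third cluster.
    """
    inter = ba.T @ ba
    sizes = ba.sum(axis=0)
    w = inter / (sizes[:, None] + sizes[None, :] - inter)
    np.fill_diagonal(w, 0.0)
    # score[a, b] = sum over k of min(w[a, k], w[b, k])
    score = np.minimum(w[:, None, :], w[None, :, :]).sum(axis=2)
    np.fill_diagonal(score, 0.0)
    top = score.max()
    sim = decay * score / top if top > 0 else score
    np.fill_diagonal(sim, 1.0)
    return sim


def refine(ba, owners, sim):
    """Replace each 0 entry with the similarity between that cluster and the
    cluster the data point actually belongs to in the same input clustering."""
    ra = ba.copy()
    for m in np.unique(owners):
        cols = np.where(owners == m)[0]
        home = cols[ba[:, cols].argmax(axis=1)]  # column of each point's own cluster
        ra[:, cols] = sim[home][:, cols]
    return ra


# Two input clusterings of six data points (toy data, not from the paper).
ba, owners = association_matrix([[0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 2, 2]])
ra = refine(ba, owners, shared_neighbour_similarity(ba))
```

Each row of `ra` can then be treated as a feature vector for a standard clustering algorithm, which is one way to read the feature-based consensus function mentioned above; a graph-based consensus would instead partition a graph built from the same matrix.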

Original language: English
Article number: 5765991
Pages (from-to): 2396-2409
Number of pages: 14
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume: 33
Issue number: 12
Early online date: 12 May 2011
DOIs
Publication status: Published - Dec 2011

Keywords

  • Clustering
  • cluster ensembles
  • cluster relations
  • data mining
  • link-based similarity
