Abstract
Cluster ensembles integrate individual component methods, which may use different parameter settings and features and may themselves be built on different representations and learning mechanisms. Such a technique offers an effective means of aggregating multiple clustering results to improve overall clustering accuracy and robustness. Many cluster ensemble approaches have been proposed, with promising results reported in the literature. To reinforce this development, this paper presents a further cluster ensemble approach for fuzzy clustering, intended for application to big data. The proposed algorithm first generates fuzzy base clusters with respect to each data feature and then employs a fuzzy hierarchical graph to represent the relationships between the resulting base clusters. Although the work employs fuzzy c-means to generate the base clusters and hierarchical clustering to implement the consensus function, when applied to large datasets it has lower time complexity than the original fuzzy c-means and hierarchical clustering algorithms. The resulting ensemble clustering mechanism is tested against traditional clustering methods on various benchmark datasets. Experimental results demonstrate that it generally outperforms crisp cluster ensembles and single-linkage agglomerative clustering in terms of accuracy and time efficiency, showing its potential for clustering big data.
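The two stages named in the abstract (per-feature fuzzy base clusters, then a hierarchical consensus over those base clusters) can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify how the fuzzy hierarchical graph is built, so the fuzzy Jaccard similarity, the average-linkage merging, the evenly spaced initial centres, and all function names below are assumptions chosen only for illustration.

```python
def fuzzy_c_means_1d(values, c, m=2.0, iters=50):
    """Standard fuzzy c-means on one feature; returns c membership vectors."""
    n = len(values)
    lo, hi = min(values), max(values)
    # Assumption: evenly spaced initial centres keep the sketch deterministic.
    centers = [lo + (i + 0.5) * (hi - lo) / c for i in range(c)]
    u = []
    for _ in range(iters):
        u = []
        for x in values:
            d = [abs(x - v) + 1e-12 for v in centers]  # avoid division by zero
            u.append([1.0 / sum((d[i] / d[k]) ** (2.0 / (m - 1.0)) for k in range(c))
                      for i in range(c)])
        centers = [sum((u[j][i] ** m) * values[j] for j in range(n))
                   / sum(u[j][i] ** m for j in range(n))
                   for i in range(c)]
    # Transpose: one membership vector (over all objects) per base cluster.
    return [[u[j][i] for j in range(n)] for i in range(c)]


def fuzzy_jaccard(a, b):
    """Assumed similarity between two fuzzy base clusters."""
    denom = sum(max(x, y) for x, y in zip(a, b))
    return sum(min(x, y) for x, y in zip(a, b)) / denom if denom else 0.0


def consensus_groups(base, k):
    """Assumed consensus step: average-linkage merging of base clusters."""
    groups = [[i] for i in range(len(base))]

    def link(g, h):
        return sum(fuzzy_jaccard(base[i], base[j]) for i in g for j in h) / (len(g) * len(h))

    while len(groups) > k:
        pairs = ((p, q) for p in range(len(groups)) for q in range(p + 1, len(groups)))
        a, b = max(pairs, key=lambda pq: link(groups[pq[0]], groups[pq[1]]))
        groups[a] = groups[a] + groups[b]
        del groups[b]  # b > a, so index a is unaffected
    return groups


def fuzzy_ensemble(data, c, k):
    """Per-feature base clustering followed by consensus over base clusters."""
    base = []
    for f in range(len(data[0])):
        base.extend(fuzzy_c_means_1d([row[f] for row in data], c))
    groups = consensus_groups(base, k)
    memberships = []
    for j in range(len(data)):
        raw = [sum(base[i][j] for i in g) / len(g) for g in groups]
        total = sum(raw)
        memberships.append([r / total for r in raw])  # normalise to sum to 1
    labels = [max(range(len(groups)), key=lambda g: row[g]) for row in memberships]
    return memberships, labels


if __name__ == "__main__":
    toy = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
    _, toy_labels = fuzzy_ensemble(toy, c=2, k=2)
    print(toy_labels)
```

Because each base clustering in this sketch runs on a single feature, the per-feature steps are independent of one another, and the consensus step operates on the base clusters rather than on the individual data objects. This is one plausible reading of why such a design could scale to large datasets; the complexity analysis itself is in the paper.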
Original language | English |
---|---|
Pages (from-to) | 2409-2421 |
Journal | Journal of Intelligent and Fuzzy Systems |
Volume | 28 |
Issue number | 6 |
Publication status | Published - 10 Aug 2015 |
Keywords
- fuzzy cluster ensemble
- big data clustering
- fuzzy c-means
- hierarchical clustering
- data mining