Pairwise similarity for cluster ensemble problem: Link-based and approximate approaches

Natthakan Iam-On*, Tossapon Boongoen

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

Abstract

Cluster ensemble methods have emerged as powerful techniques that aggregate several input data clusterings to generate a single output clustering with improved robustness and stability. In particular, link-based similarity techniques have recently been introduced with superior performance to the conventional co-association method. Their potential and applicability are, however, limited by the underlying time complexity. In light of this shortcoming, this paper presents two approximate approaches that mitigate the problem of time complexity: the approximate algorithm approach (Approximate SimRank Based Similarity matrix) and the approximate data approach (Prototype-based cluster ensemble model). The first approach decreases the computational requirement of the existing link-based technique; the second reduces the size of the problem by finding a smaller, representative, approximate dataset, derived by a density-biased sampling technique. The advantages of both approximate approaches are empirically demonstrated over 22 datasets (both artificial and real data) and statistical comparisons of performance (with 95% confidence level) with three well-known validity criteria. Results obtained from these experiments suggest that approximate techniques can efficiently help scale up the application of link-based similarity methods to a wider range of data sizes.
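For readers unfamiliar with the conventional co-association method that the abstract uses as its baseline, the following is a minimal illustrative sketch, not the chapter's own code. It assumes the standard definition: the similarity of two data points is the fraction of base clusterings that place them in the same cluster. The function name `co_association` and the toy labelings are hypothetical, chosen only for illustration; the link-based and approximate methods of the chapter are not reproduced here.

```python
def co_association(labelings):
    """Return the pairwise co-association matrix for a cluster ensemble.

    labelings: list of base clusterings, each a list of cluster labels,
    one label per data point (all clusterings cover the same points).
    """
    n_clusterings = len(labelings)
    n_points = len(labelings[0])
    matrix = [[0.0] * n_points for _ in range(n_points)]
    for labels in labelings:
        for i in range(n_points):
            for j in range(n_points):
                # Count one vote whenever points i and j share a cluster.
                if labels[i] == labels[j]:
                    matrix[i][j] += 1.0
    # Normalise vote counts to the fraction of clusterings that agree.
    return [[count / n_clusterings for count in row] for row in matrix]


# Three hypothetical base clusterings of five points (label ids are arbitrary).
base_clusterings = [
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 2],
]
similarity = co_association(base_clusterings)
```

The pairwise loop costs time quadratic in the number of data points for every base clustering, which gives a rough sense of why the size of the similarity matrix becomes a concern on larger datasets.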

Original language: English
Title: Transactions on Large-Scale Data- and Knowledge-Centered Systems IX
Editors: Abdelkader Hameurlain, Josef Küng, Roland Wagner
Publisher: Springer Nature
Pages: 95-122
Number of pages: 28
ISBN (Print): 9783642400681
Digital Object Identifiers (DOIs):
Status: Published - 2013
Published externally: Yes

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7980
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349
