Improving the Consensus Clustering of Data with Missing Values Using the Link-Based Approach

Natthakan Iam-On*

*Corresponding author for this work

Research output: Contribution to journal (Article)

Abstract

Data science aims to keep pace with a data-intensive lifestyle and with the demand for decision support, which has become common in domains such as medicine, education, and other smart solutions. High-quality data analysis is therefore essential for accurate and effective downstream use. In data clustering specifically, a large body of work has concentrated on modeling a distance metric and a clustering algorithm under the assumption that the data are complete. This assumption does not always hold, as missing values can occur in the dataset under examination. Instead of filling in these values with an imputation method, a recent study successfully uses consensus clustering to overcome the problem without an explicit imputation procedure. This paper extends that framework to link-based consensus clustering, which provides a more refined summarization of the cluster ensemble and hence of the resulting data partition. The approach exhibits promising performance on several benchmark datasets obtained from the UCI repository.
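The general idea of clustering incomplete data through an ensemble, without imputation, can be illustrated with a small sketch. It is not the paper's method: the abstract does not say how the ensemble is generated or how the link-based refinement is computed, so the code below makes its own assumptions. It builds one base clustering per feature pair, clusters only the rows that are complete on that pair, summarizes the ensemble with a plain co-association matrix (not the link-based similarity), and cuts it with average-linkage merging. All function names and the toy data are hypothetical.

```python
import itertools
import math

def kmeans(points, k, iters=20):
    """Lloyd's k-means with farthest-first initialization; returns labels."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels

def co_association(data, k, n_feats=2):
    """Fraction of base clusterings in which two rows share a cluster.

    Assumption: one base clustering per feature subset of size n_feats,
    run only on rows with no missing value (None) in that subset.
    """
    n, d = len(data), len(data[0])
    together = [[0] * n for _ in range(n)]
    seen = [[0] * n for _ in range(n)]
    for feats in itertools.combinations(range(d), n_feats):
        rows = [i for i in range(n) if all(data[i][f] is not None for f in feats)]
        if len(rows) < k:
            continue
        labels = kmeans([tuple(data[i][f] for f in feats) for i in rows], k)
        for a, i in enumerate(rows):
            for b, j in enumerate(rows):
                seen[i][j] += 1
                together[i][j] += labels[a] == labels[b]
    # Pairs never clustered together in any base run get similarity 0.
    return [[together[i][j] / seen[i][j] if seen[i][j] else 0.0 for j in range(n)]
            for i in range(n)]

def consensus_partition(sim, k):
    """Average-linkage merging on the similarity matrix until k clusters remain."""
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                s = sum(sim[i][j] for i in clusters[a] for j in clusters[b])
                s /= len(clusters[a]) * len(clusters[b])
                if best is None or s > best[0]:
                    best = (s, a, b)
        _, a, b = best
        merged = clusters.pop(b)
        clusters[a].extend(merged)
    labels = [0] * len(sim)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels

# Hypothetical toy data: two well-separated groups, None marks a missing value.
toy = [
    (0.1, 0.3, None), (0.4, None, 0.2), (None, 0.5, 0.6), (0.7, 0.2, 0.9), (0.3, 0.8, 0.4),
    (9.6, 10.2, None), (10.3, None, 9.8), (None, 9.7, 10.1), (10.0, 9.5, 10.4), (9.8, 10.4, 9.9),
]
labels = consensus_partition(co_association(toy, k=2), k=2)
print(labels)
```

Rows with a missing value still receive a cluster label because each is complete on at least one feature pair, so no value is ever filled in. A link-based variant would replace the raw co-association entries with a similarity that also accounts for relations between clusters across the ensemble.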
Original language: English
Article number: 7
Number of pages: 9
Journal: Data-Enabled Discovery and Applications
Volume: 3
Issue number: 1
Publication status: Published - 29 Mar 2019
Externally published: Yes
