TY - GEN
T1 - Exploiting Consensus Clustering for Light Curve Data Analysis
AU - Panwong, Patcharaporn
AU - Boongoen, Tossapon
AU - Iam-On, Natthakan
AU - Mullaney, James
N1 - Funding Information:
This work is part of a PhD dissertation and is supported by the Center of Excellence in AI and Analytic Technology (Mae Fah Luang University) and the STFC GCRF project: From Stars to Baht Phase II.
Publisher Copyright:
© 2019 IEEE.
PY - 2019/10
Y1 - 2019/10
N2 - Consensus clustering has been one of the major fields in data science, with a growing number of theoretical developments and publications over the past twenty years. Recently, a new method for ensemble generation has been introduced that uses noise to create diversity via data perturbation. Given its good results on several benchmark data sets, applying it to domain-specific problems, such as those in astronomy, is an appropriate next step. Accordingly, this paper presents an empirical study of noise-induced consensus clustering on a real data collection obtained from a published LSST light curve catalogue. Light curve profiles can be categorized into groups of known astronomical objects with common characteristics and behavior over time. It is therefore important to recognize new or unforeseen objects detected in a sky survey as one of those types, which leads to appropriate data collection and further analysis. In particular, two different feature extraction techniques are used to derive features from raw time series records. With these features, the performance of simple k-means clustering and its noise-induced ensemble counterpart is compared using a set of four common clustering validity indices. The experimental results are discussed with respect to imbalanced data, the quality of the extracted features and the number of clusters. These findings may help to improve the application of single or ensemble clustering to light curve data in the future.
AB - Consensus clustering has been one of the major fields in data science, with a growing number of theoretical developments and publications over the past twenty years. Recently, a new method for ensemble generation has been introduced that uses noise to create diversity via data perturbation. Given its good results on several benchmark data sets, applying it to domain-specific problems, such as those in astronomy, is an appropriate next step. Accordingly, this paper presents an empirical study of noise-induced consensus clustering on a real data collection obtained from a published LSST light curve catalogue. Light curve profiles can be categorized into groups of known astronomical objects with common characteristics and behavior over time. It is therefore important to recognize new or unforeseen objects detected in a sky survey as one of those types, which leads to appropriate data collection and further analysis. In particular, two different feature extraction techniques are used to derive features from raw time series records. With these features, the performance of simple k-means clustering and its noise-induced ensemble counterpart is compared using a set of four common clustering validity indices. The experimental results are discussed with respect to imbalanced data, the quality of the extracted features and the number of clusters. These findings may help to improve the application of single or ensemble clustering to light curve data in the future.
KW - astronomy
KW - consensus clustering
KW - light curve
KW - noise and feature extraction
UR - http://www.scopus.com/inward/record.url?scp=85078006875&partnerID=8YFLogxK
U2 - 10.1109/ECICE47484.2019.8942715
DO - 10.1109/ECICE47484.2019.8942715
M3 - Conference Proceeding (Non-Journal item)
AN - SCOPUS:85078006875
T3 - 2019 IEEE Eurasia Conference on IOT, Communication and Engineering, ECICE 2019
SP - 498
EP - 501
BT - 2019 IEEE Eurasia Conference on IOT, Communication and Engineering, ECICE 2019
A2 - Meen, Teen-Hang
PB - IEEE Press
T2 - 2019 IEEE Eurasia Conference on IOT, Communication and Engineering, ECICE 2019
Y2 - 3 October 2019 through 6 October 2019
ER -