Abstract
The work presented in this paper develops new imputation methods to better handle missing values encountered in astronomical data analysis, especially the classification of transient events in a sky survey from the Gravitational wave Optical Transient Observatory (GOTO) project. In particular, the framework of cluster-directed selection of neighbors, which has proven effective for the benchmark local imputation techniques KNNimpute and LLSimpute, is extended to new multi-stage models. The proposed models, namely Iterative-CKNN and Iterative-CLLS, are novel, with an original application to the analysis of sky survey data. They combine the advantages of both local approaches, in which estimates are summarized from neighbors in the same data cluster, with an iterative process that refines previous guesses. In experiments with simulated datasets corresponding to different survey sizes and missing rates between 1% and 20%, they usually outperform the baseline models and Bayesian Principal Component Analysis (BPCA), a well-known global technique. For instance, at a 10% missing rate, Iterative-CLLS appears to be the most accurate, with an NRMSE score of 0.190, while BPCA and the best of its baseline models reach 0.351 and 0.249, respectively. In practical terms, these methods have proven effective for classifying transients using common algorithms such as KNN, Naive Bayes and Random Forest.
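The idea summarized in the abstract (neighbors drawn from the same data cluster, with estimates refined over several rounds) can be illustrated with a minimal sketch. This is not the authors' Iterative-CKNN implementation: the function names, the use of plain k-means for clustering, the column-mean initial guess, the fixed number of rounds, and the NRMSE normalization (RMSE over the missing entries divided by the standard deviation of their true values) are all assumptions made for illustration. Missing entries are represented as `None`, and every column is assumed to have at least one observed value.

```python
import math

def _dist(a, b, cols=None):
    """Euclidean distance, optionally restricted to the given column indices."""
    cols = range(len(a)) if cols is None else cols
    return math.sqrt(sum((a[c] - b[c]) ** 2 for c in cols))

def _kmeans_labels(rows, n_clusters, n_iter=20):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    centroids = [rows[0][:]]
    while len(centroids) < n_clusters:
        far = max(rows, key=lambda r: min(_dist(r, c) for c in centroids))
        centroids.append(far[:])
    labels = [0] * len(rows)
    for _ in range(n_iter):
        labels = [min(range(n_clusters), key=lambda g: _dist(r, centroids[g]))
                  for r in rows]
        for g in range(n_clusters):
            members = [r for r, lab in zip(rows, labels) if lab == g]
            if members:  # an empty cluster keeps its previous centroid
                centroids[g] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def iterative_cluster_knn_impute(data, k=2, n_clusters=2, n_rounds=5):
    """Hypothetical sketch: cluster-restricted KNN imputation, repeated in rounds."""
    n_cols = len(data[0])
    missing = [(i, j) for i, row in enumerate(data)
               for j, v in enumerate(row) if v is None]
    # Stage 0 (assumed): start from the column mean of the observed entries.
    means = []
    for j in range(n_cols):
        seen = [row[j] for row in data if row[j] is not None]
        means.append(sum(seen) / len(seen))
    filled = [[means[j] if v is None else v for j, v in enumerate(row)]
              for row in data]
    for _ in range(n_rounds):
        # Re-cluster on the current guesses, then re-estimate each missing entry.
        labels = _kmeans_labels(filled, n_clusters)
        updated = [row[:] for row in filled]
        for i, j in missing:
            # Distances use only the columns actually observed in row i.
            obs = [c for c in range(n_cols) if data[i][c] is not None]
            peers = [p for p in range(len(filled))
                     if p != i and labels[p] == labels[i]]
            if not peers:  # singleton cluster: fall back to every other row
                peers = [p for p in range(len(filled)) if p != i]
            peers.sort(key=lambda p: _dist(filled[i], filled[p], obs))
            nearest = peers[:k]
            updated[i][j] = sum(filled[p][j] for p in nearest) / len(nearest)
        filled = updated
    return filled

def nrmse(truth, filled, missing):
    """RMSE over the missing entries, divided by the std of their true values."""
    errs = [(filled[i][j] - truth[i][j]) ** 2 for i, j in missing]
    vals = [truth[i][j] for i, j in missing]
    mean = sum(vals) / len(vals)
    var = sum((v - mean) ** 2 for v in vals) / len(vals)
    return math.sqrt(sum(errs) / len(errs)) / math.sqrt(var)

# Toy usage: two well-separated groups of rows, two entries removed.
truth = [[1.0, 1.1, 0.9], [0.9, 1.0, 1.1], [1.1, 0.9, 1.0], [1.0, 1.0, 1.0],
         [10.0, 10.1, 9.9], [9.9, 10.0, 10.1], [10.1, 9.9, 10.0], [10.0, 10.0, 10.0]]
holes = [(0, 2), (5, 0)]
data = [[None if (i, j) in holes else v for j, v in enumerate(row)]
        for i, row in enumerate(truth)]
filled = iterative_cluster_knn_impute(data, k=2, n_clusters=2)
print(nrmse(truth, filled, holes))
```

Restricting the candidate neighbors to the target row's own cluster is what distinguishes this from ordinary KNN imputation; on the toy data it keeps rows from the distant group out of the estimate, which a plain column-mean fill cannot do.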
Original language | English |
---|---|
Article number | 102881 |
Journal | Information Processing and Management |
Volume | 59 |
Issue number | 2 |
Early online date | 03 Feb 2022 |
DOIs | |
Publication status | Published - 01 Mar 2022 |
Keywords
- Astronomy
- Clustering
- Imputation
- Missing value
- Sky survey
Fingerprint
Dive into the research topics of 'Estimation of missing values in astronomical survey data: An improved local approach using cluster directed neighbor selection'. Together they form a unique fingerprint.
Projects
- 1 Finished
- Robust burnt scar profiling using deep learning and ensemble modelling with Remote sensing data
Shen, Q. (PI)
17 Feb 2021 → 16 Feb 2022
Project: Externally funded research