Graph clustering-based discretization approach to microarray data

Kittakorn Sriwanna*, Tossapon Boongoen, Natthakan Iam-On

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Several data mining techniques require discrete data, and learning over discrete domains often performs better than learning over continuous data. Multivariate discretization transforms continuous data into discrete data while taking correlations among attributes into account. Given the benefit of this idea, many multivariate discretization algorithms have been proposed. However, only a few discretization algorithms apply directly to microarray (gene expression) data, which is high-dimensional and imbalanced, and no multivariate method has yet been put forward for microarray data analysis. According to recently published research, the graph clustering-based discretization methods based on splitting and merging (GraphS and GraphM) usually achieve superior results compared with many well-known discretization algorithms. In this paper, GraphS and GraphM are extended with an alpha parameter, which is the ratio between the similarity of gene expressions (distance) and the similarity of the class labels. Moreover, the extensions consider three similarity measures (cosine similarity, Euclidean distance, and Pearson correlation) in order to determine the proper pairwise similarity measure. An evaluation on 20 real microarray datasets with 4 classifiers suggests that the two proposed methods based on cosine similarity, GraphM(C) and GraphS(C), outperform 9 state-of-the-art discretization algorithms on three classification performance measures (ACC, AUC, Kappa) and on running time.
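The role of the alpha parameter can be illustrated with a minimal Python sketch. The abstract does not give the exact formula, so everything here is an assumption made for illustration: the convex combination `alpha * expression_similarity + (1 - alpha) * label_similarity`, the mapping of Euclidean distance to a similarity via `1 / (1 + d)`, and all function names are hypothetical and are not taken from GraphS or GraphM.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two expression vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_x * norm_y)


def euclidean_similarity(x, y):
    """Euclidean distance mapped to (0, 1]; the 1 / (1 + d) mapping is an assumption."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return 1.0 / (1.0 + dist)


def pearson_similarity(x, y):
    """Pearson correlation coefficient between two expression vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0  # convention chosen here for constant vectors
    return cov / math.sqrt(var_x * var_y)


def combined_similarity(x, y, label_x, label_y, alpha, measure=cosine_similarity):
    """Hypothetical alpha-weighted mix of expression and class-label similarity.

    alpha = 1 uses only the expression similarity; alpha = 0 uses only
    class-label agreement. This weighting scheme is an illustrative assumption.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    expression_sim = measure(x, y)
    label_sim = 1.0 if label_x == label_y else 0.0
    return alpha * expression_sim + (1.0 - alpha) * label_sim
```

For example, `combined_similarity(a, b, "tumor", "normal", alpha=0.5)` would halve the cosine similarity of two samples `a` and `b` that carry different class labels, which shows how a smaller alpha lets class information dominate the pairwise weights.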

Original language: English
Pages (from-to): 879-906
Number of pages: 28
Journal: Knowledge and Information Systems
Issue number: 2
Early online date: 05 Sept 2018
Status: Published - 01 Aug 2019
Externally published: Yes
