An Enhanced Univariate Discretization Based on Cluster Ensembles

Kittakorn Sriwanna*, Natthakan Iam-On, Tossapon Boongoen

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference Proceeding (Non-Journal item)

Abstract

Most discretization algorithms focus on the univariate case. In general, they take into account the target class or the interval-wise frequency of the data. In doing so, useful information about natural groups, hidden patterns and correlations among the attributes may be lost. In response, this paper introduces a new pruning method that exploits natural groups, or clusters, as an explicit constraint on traditional cut-point determination techniques. This unsupervised approach makes use of cluster ensembles to reveal similarities between data belonging to adjacent intervals. Specifically, a cut-point between a pair of highly similar or related intervals is dropped. This pruning mechanism is coupled with three different univariate discretization algorithms, and the evaluation is conducted on 10 datasets with 3 classifier models. The results suggest that the proposed method usually achieves higher classification accuracy than the three baseline counterparts.
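The pruning idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not say how interval similarity is measured or how the pruning decision is made, so the sketch assumes a co-association measure (the fraction of base clusterings that place two samples in the same cluster), averages it over all cross-interval sample pairs, and drops a cut-point when that average reaches a fixed threshold. The function names, the `threshold` parameter and the toy data are all hypothetical.

```python
from itertools import product


def co_association(i, j, ensemble):
    """Fraction of base clusterings that put samples i and j in the same cluster."""
    return sum(labels[i] == labels[j] for labels in ensemble) / len(ensemble)


def interval_similarity(left, right, ensemble):
    """Mean co-association over all (left sample, right sample) pairs."""
    pairs = list(product(left, right))
    if not pairs:
        return 0.0  # assumption: an empty interval is never treated as similar
    return sum(co_association(i, j, ensemble) for i, j in pairs) / len(pairs)


def prune_cut_points(values, cut_points, ensemble, threshold=0.5):
    """Drop cut-points that separate two highly similar adjacent intervals.

    values     -- one numeric attribute, one entry per sample
    cut_points -- candidate cuts from any univariate discretizer
    ensemble   -- list of clusterings, each a list of cluster labels per sample
    threshold  -- assumed similarity level at or above which a cut is dropped
    """
    cuts = sorted(cut_points)
    kept = []
    lower = float("-inf")  # left edge of the current (possibly merged) interval
    for k, cut in enumerate(cuts):
        upper = cuts[k + 1] if k + 1 < len(cuts) else float("inf")
        left = [i for i, v in enumerate(values) if lower < v <= cut]
        right = [i for i, v in enumerate(values) if cut < v <= upper]
        if interval_similarity(left, right, ensemble) >= threshold:
            continue  # drop the cut: the two intervals merge
        kept.append(cut)
        lower = cut
    return kept


# Toy data: six samples, three candidate cuts, three base clusterings.
values = [1.0, 1.2, 1.4, 3.0, 3.2, 3.4]
cuts = [1.3, 2.0, 3.3]
ensemble = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
]
pruned = prune_cut_points(values, cuts, ensemble, threshold=0.5)
print(pruned)  # prints [2.0]
```

In this toy example the cuts at 1.3 and 3.3 split samples that the base clusterings mostly group together, so they are dropped, while the cut at 2.0 separates two groups that rarely co-occur and is kept. Because the function only takes a list of candidate cuts, it could in principle sit after any univariate discretizer, which matches the coupling described in the abstract.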

Original language: English
Title of host publication: Intelligent and Evolutionary Systems
Subtitle of host publication: The 19th Asia Pacific Symposium, IES 2015, Bangkok, Thailand, November 2015, Proceedings
Editors: Kittichai Lavangnananda, Somnuk Phon-Amnuaisuk, Worrawat Engchuan, Jonathan H. Chan
Publisher: Springer Nature
Pages: 85-98
Number of pages: 14
ISBN (Electronic): 978-3-319-27000-5
ISBN (Print): 978-3-319-26999-3, 978-3-319-38743-7
Publication status: Published - 12 Nov 2015
Externally published: Yes

Publication series

Name: Proceedings in Adaptation, Learning and Optimization (PALO)
Publisher: Springer
Number: 1
Volume: 5
ISSN (Print): 2363-6084
ISSN (Electronic): 2363-6092

Keywords

  • discretization
  • clustering
  • cluster ensembles
  • data mining
