TY - JOUR
T1 - Fuzzy-Rough Data Reduction with Ant Colony Optimization
AU - Jensen, Richard
AU - Shen, Qiang
N1 - R. Jensen and Q. Shen, 'Fuzzy-Rough Data Reduction with Ant Colony Optimization,' Fuzzy Sets and Systems, vol. 149, no. 1, pp. 5-20, 2005.
PY - 2005/1/1
Y1 - 2005/1/1
AB - Feature selection refers to the problem of selecting those input features that are most predictive of a given outcome, a problem encountered in many areas such as machine learning, pattern recognition and signal processing. In particular, solutions to this problem have found successful application in tasks that involve datasets containing huge numbers of features (in the order of tens of thousands), which would otherwise be impossible to process further. Recent examples include text processing and web content classification. Rough set theory has been used as such a dataset pre-processor with much success, but current methods are inadequate at finding minimal reductions, that is, the smallest possible sets of features. To alleviate this difficulty, a feature selection technique that employs a hybrid variant of rough sets, fuzzy-rough sets, has been developed recently and has been shown to be effective. However, this method is still not able to find the optimal subsets regularly. This paper proposes a new feature selection mechanism based on Ant Colony Optimization to address this shortcoming. The method is then applied to the problem of finding optimal feature subsets in the fuzzy-rough data reduction process. The present work is applied to complex systems monitoring and experimentally compared with the original fuzzy-rough method, an entropy-based feature selector, and a transformation-based reduction method, PCA.
KW - Data reduction
KW - Fuzzy-rough sets
KW - Ant colony optimization
KW - Feature selection
U2 - 10.1016/j.fss.2004.07.014
DO - 10.1016/j.fss.2004.07.014
M3 - Article
SN - 0165-0114
VL - 149
SP - 5
EP - 20
JO - Fuzzy Sets and Systems
JF - Fuzzy Sets and Systems
IS - 1
ER -