TY - JOUR
T1 - Fuzzy-Rough Sets Assisted Attribute Selection
AU - Shen, Qiang
AU - Jensen, Richard
N1 - R. Jensen and Q. Shen. Fuzzy-Rough Sets Assisted Attribute Selection. IEEE Transactions on Fuzzy Systems, vol. 15, no. 1, pp. 73-89, 2007.
PY - 2007
Y1 - 2007
N2 - Attribute selection (AS) refers to the problem of selecting those input attributes or features that are most predictive of a given outcome; a problem encountered in many areas such as machine learning, pattern recognition and signal processing. Unlike other dimensionality reduction methods, attribute selectors preserve the original meaning of the attributes after reduction. This has found application in tasks that involve datasets containing huge numbers of attributes (in the order of tens of thousands) which, for some learning algorithms, might be impossible to process further. Recent examples include text processing and web content classification. AS techniques have also been applied to small and medium-sized datasets in order to locate the most informative attributes for later use. One of the many successful applications of rough set theory has been to this area. The rough set ideology of using only the supplied data and no other information has many benefits in AS, where most other methods require supplementary knowledge. However, the main limitation of rough set-based attribute selection in the literature is the restrictive requirement that all data is discrete. In classical rough set theory, it is not possible to consider real-valued or noisy data. This paper investigates a novel approach based on fuzzy-rough sets, fuzzy rough feature selection (FRFS), that addresses these problems and retains dataset semantics. FRFS is applied to two challenging domains where a feature reducing step is important; namely, web content classification and complex systems monitoring. The utility of this approach is demonstrated and is compared empirically with several dimensionality reducers. In the experimental studies, FRFS is shown to equal or improve classification accuracy when compared to the results from unreduced data. Classifiers that use a lower dimensional set of attributes which are retained by fuzzy-rough reduction outperform those that employ more attributes returned by the existing crisp rough reduction method. In addition, it is shown that FRFS is more powerful than the other AS techniques in the comparative study.
AB - Attribute selection (AS) refers to the problem of selecting those input attributes or features that are most predictive of a given outcome; a problem encountered in many areas such as machine learning, pattern recognition and signal processing. Unlike other dimensionality reduction methods, attribute selectors preserve the original meaning of the attributes after reduction. This has found application in tasks that involve datasets containing huge numbers of attributes (in the order of tens of thousands) which, for some learning algorithms, might be impossible to process further. Recent examples include text processing and web content classification. AS techniques have also been applied to small and medium-sized datasets in order to locate the most informative attributes for later use. One of the many successful applications of rough set theory has been to this area. The rough set ideology of using only the supplied data and no other information has many benefits in AS, where most other methods require supplementary knowledge. However, the main limitation of rough set-based attribute selection in the literature is the restrictive requirement that all data is discrete. In classical rough set theory, it is not possible to consider real-valued or noisy data. This paper investigates a novel approach based on fuzzy-rough sets, fuzzy rough feature selection (FRFS), that addresses these problems and retains dataset semantics. FRFS is applied to two challenging domains where a feature reducing step is important; namely, web content classification and complex systems monitoring. The utility of this approach is demonstrated and is compared empirically with several dimensionality reducers. In the experimental studies, FRFS is shown to equal or improve classification accuracy when compared to the results from unreduced data. Classifiers that use a lower dimensional set of attributes which are retained by fuzzy-rough reduction outperform those that employ more attributes returned by the existing crisp rough reduction method. In addition, it is shown that FRFS is more powerful than the other AS techniques in the comparative study.
U2 - 10.1109/TFUZZ.2006.889761
DO - 10.1109/TFUZZ.2006.889761
M3 - Article
VL - 15
SP - 73
EP - 89
JO - IEEE Transactions on Fuzzy Systems
JF - IEEE Transactions on Fuzzy Systems
IS - 1
ER -