Feature selection refers to the problem of selecting those input features that are most predictive of a given outcome, a problem encountered in many areas such as machine learning, pattern recognition and signal processing. In particular, solutions to this problem have found successful application in tasks that involve datasets containing huge numbers of features (on the order of tens of thousands), which would otherwise be impossible to process further. Recent examples include text processing and web content classification. Rough set theory has been used as such a dataset pre-processor with much success, but current methods are inadequate at finding minimal reductions, that is, the smallest possible sets of features. To alleviate this difficulty, a feature selection technique that employs a hybrid variant of rough sets, fuzzy-rough sets, has recently been developed and shown to be effective. However, this method is still unable to find optimal subsets regularly. This paper proposes a new feature selection mechanism based on Ant Colony Optimization to address this shortcoming. The method is then applied to the problem of finding optimal feature subsets in the fuzzy-rough data reduction process. The present work is applied to complex systems monitoring and experimentally compared with the original fuzzy-rough method, an entropy-based feature selector, and a transformation-based reduction method, PCA.