The ant colony optimization meta-heuristic
New ideas in optimization
Future Generation Computer Systems
A Comparative Study on Feature Selection in Text Categorization
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
A Study of Some Properties of Ant-Q
PPSN IV Proceedings of the 4th International Conference on Parallel Problem Solving from Nature
Toward Integrating Feature Selection Algorithms for Classification and Clustering
IEEE Transactions on Knowledge and Data Engineering
Scoring and Selecting Terms for Text Categorization
IEEE Intelligent Systems
IEEE Intelligent Systems
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Text Clustering with Feature Selection by Using Statistical Data
IEEE Transactions on Knowledge and Data Engineering
Text feature selection using ant colony optimization
Expert Systems with Applications: An International Journal
Feature selection for text classification with Naïve Bayes
Expert Systems with Applications: An International Journal
Review: A review of ant algorithms
Expert Systems with Applications: An International Journal
Using Ant Colony Optimization algorithm for solving project management problems
Expert Systems with Applications: An International Journal
AntNet: distributed stigmergetic control for communications networks
Journal of Artificial Intelligence Research
Hadoop: The Definitive Guide
Ant colony system: a cooperative learning approach to the traveling salesman problem
IEEE Transactions on Evolutionary Computation
Hi-index | 12.05 |
Feature selection is an indispensable preprocessing step for effective analysis of high dimensional data. It removes irrelevant features, improves the predictive accuracy and increases the comprehensibility of the model constructed by the classifiers sensitive to features. Finding an optimal feature subset for a problem in an outsized domain becomes intractable and many such feature selection problems have been shown to be NP-hard. Optimization algorithms are frequently designed for NP-hard problems to find nearly optimal solutions with a practical time complexity. This paper formulates the text feature selection problem as a combinatorial problem and proposes an Ant Colony Optimization (ACO) algorithm to find the nearly optimal solution for the same. It differs from the earlier algorithm by Aghdam et al. by including a heuristic function based on statistics and a local search. The algorithm aims at determining a solution that includes 'n' distinct features for each category. Optimization algorithms based on wrapper models show better results but the processes involved in them are time intensive. The availability of parallel architectures as a cluster of machines connected through fast Ethernet has increased the interest on parallelization of algorithms. The proposed ACO algorithm was parallelized and demonstrated with a cluster formed with a maximum of six machines. Documents from 20 newsgroup benchmark dataset were used for experimentation. Features selected by the proposed algorithm were evaluated using Naive bayes classifier and compared with the standard feature selection techniques. It was observed that the performance of the classifier had been improved with the features selected by the enhanced ACO and local search. Error of the classifier decreases over iterations and it was observed that the number of positive features increases with the number of iterations.