Text classification typically builds a model from training examples in order to classify text documents automatically. However, as text document repositories grow rapidly, the storage requirements and computational cost of model learning increase. Instance selection addresses these limitations by reducing the data size, filtering out noisy data from a given training dataset. In this paper, we introduce a novel algorithm for this task, namely a biological-based genetic algorithm (BGA). BGA incorporates "biological evolution" into the evolutionary process, on the premise that the most streamlined process also complies with reasonable rules: after long-term evolution, organisms find the most efficient way to allocate resources and evolve. By closely simulating natural evolution, the algorithm is intended to be both efficient and effective. Experimental results on the TechTC-100 and Reuters-21578 datasets show that BGA outperforms five state-of-the-art algorithms. In particular, using BGA to select text documents not only yields the largest dataset reduction rate but also requires the least computational time. Moreover, with BGA, the k-NN and SVM classifiers achieve similar or slightly better classification accuracy than with GA.
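
To make the idea of evolutionary instance selection concrete, the following is a minimal sketch of a plain genetic algorithm that searches for a subset of training instances. It is not the BGA described above, whose biological operators are not specified here. Everything in it is an assumption made for illustration: the function names, the bit-mask encoding, the 1-NN fitness weighted by `alpha`, the truncation selection, and the toy 2-D vectors standing in for document feature vectors.

```python
import random

def nearest_label(x, points, labels):
    """1-NN prediction using squared Euclidean distance."""
    best = min(range(len(points)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(x, points[i])))
    return labels[best]

def fitness(mask, X, y, X_val, y_val, alpha=0.9):
    """Weighted sum of validation accuracy and reduction rate.
    alpha is an assumed trade-off weight, not a value from the paper."""
    kept = [i for i, bit in enumerate(mask) if bit]
    if not kept:
        return 0.0  # an empty training subset cannot classify anything
    pts = [X[i] for i in kept]
    lbl = [y[i] for i in kept]
    acc = sum(nearest_label(x, pts, lbl) == t
              for x, t in zip(X_val, y_val)) / len(y_val)
    reduction = 1 - len(kept) / len(X)
    return alpha * acc + (1 - alpha) * reduction

def ga_select(X, y, X_val, y_val, pop_size=20, generations=30,
              p_mut=0.05, seed=0):
    """Generic GA over bit masks: bit i == 1 keeps training instance i."""
    rng = random.Random(seed)
    n = len(X)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]

    def score(mask):
        return fitness(mask, X, y, X_val, y_val)

    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]          # truncation selection with elitism
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)         # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=score)

# Toy data: two well-separated classes (hypothetical stand-ins for documents).
X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2),
     (5.0, 5.1), (5.2, 4.9), (4.8, 5.3), (5.1, 5.0)]
y = [0, 0, 0, 0, 1, 1, 1, 1]
X_val = [(0.1, 0.1), (5.0, 5.0)]
y_val = [0, 1]

mask = ga_select(X, y, X_val, y_val)
selected = [X[i] for i, bit in enumerate(mask) if bit]
print(f"kept {len(selected)} of {len(X)} instances")
```

The fitness function rewards both accuracy and a smaller retained set, which mirrors the two goals named in the abstract (classification accuracy and reduction rate); how BGA itself balances them is not stated in this text.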