Instance-Based Learning Algorithms
Machine Learning
A Comparative Study on Feature Selection in Text Categorization
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Introduction to Data Mining, (First Edition)
Introduction to Data Mining, (First Edition)
Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems)
Hi-index | 0.00 |
Pre-processing is an important part of machine learning, and has been shown to significantly improve the performance of classifiers. In this paper, we take a selection of pre-processing methods--focusing specifically on discretization and feature selection--and empirically examine their combined effect on classifier performance. In our experiments learning algorithms, namely one-R, ID3, naive Bayes, and IB1, and explore the impact of different forms of preprocessing on each combination of dataset and algorithm. We find that in general the combination of wrapper-based forward selection and naive supervised methods of discretization yield consistently above-baseline results.