Automatic text processing: the transformation, analysis, and retrieval of information by computer
An example-based mapping method for text categorization and retrieval
ACM Transactions on Information Systems (TOIS)
Foundations of statistical natural language processing
Machine Learning
Text Categorization with Support Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning
A Comparative Study on Feature Selection in Text Categorization
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
RCV1: A New Benchmark Collection for Text Categorization Research
The Journal of Machine Learning Research
OCFS: optimal orthogonal centroid feature selection for text categorization
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Comparison of feature selection methods for sentiment analysis
AI'10 Proceedings of the 23rd Canadian conference on Advances in Artificial Intelligence
Maximum entropy provides a principled way of estimating probability distributions and has been widely used for a range of language processing tasks. In this paper, we explore the use of different feature selection methods for text categorization with maximum entropy modeling. We also propose a new feature selection method based on the difference between a feature's relative document frequencies in the relevant and irrelevant classes. Our experiments on the Reuters RCV1 data set show that our feature selection method outperforms the other methods tested, and that maximum entropy modeling is a competitive approach to text categorization.
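The proposed scoring idea can be sketched as follows: for each term, compute the fraction of relevant-class documents containing it minus the fraction of irrelevant-class documents containing it, and keep the highest-scoring terms. This is an illustrative sketch of the described criterion, not the authors' exact formulation; the function name and data layout are assumptions.

```python
from collections import Counter

def rdf_diff_scores(docs, labels, relevant_class):
    """Score each term by the difference between its relative document
    frequency in the relevant class and in the irrelevant classes.

    docs: list of tokenized documents (lists of terms)
    labels: class label per document
    relevant_class: the class treated as "relevant"
    (Hypothetical helper illustrating the paper's criterion.)
    """
    rel_docs = [d for d, y in zip(docs, labels) if y == relevant_class]
    irr_docs = [d for d, y in zip(docs, labels) if y != relevant_class]
    # Document frequency: count each term once per document.
    rel_df = Counter(t for d in rel_docs for t in set(d))
    irr_df = Counter(t for d in irr_docs for t in set(d))
    vocab = set(rel_df) | set(irr_df)
    return {
        t: rel_df[t] / max(len(rel_docs), 1)
           - irr_df[t] / max(len(irr_docs), 1)
        for t in vocab
    }
```

Terms with scores near +1 occur mostly in relevant documents, near -1 mostly in irrelevant ones; ranking by this score (or its absolute value) gives a feature-selection ordering.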