This paper proposes a new technique for reducing the dimensionality of the feature space in text categorization. Unlike conventional methods, our phrase features are generated from word sequences of different lengths (multigrams) taken from phrases extracted from whole documents. We then use the odds ratio (OR) to select phrase features. In preliminary experiments, the proposed technique performs better than conventional methods.
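The two steps described above, multigram generation followed by odds-ratio selection, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the maximum multigram length `max_len`, the additive smoothing constant `eps`, and the binary (one category versus the rest) setting are all assumptions made for illustration, and the odds ratio is computed in its common log form, log[P(t|c)(1 - P(t|not c)) / ((1 - P(t|c)) P(t|not c))], over document frequencies.

```python
import math
from collections import Counter


def multigrams(tokens, max_len=3):
    """Return all contiguous word sequences of length 1..max_len.

    max_len is an illustrative assumption; the abstract gives no value.
    """
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(tokens) - n + 1)
    ]


def odds_ratio_scores(pos_docs, neg_docs, max_len=3, eps=0.5):
    """Score each multigram by its log odds ratio for the positive class.

    pos_docs / neg_docs are lists of token lists. eps is an assumed
    smoothing constant that keeps the ratio finite when a multigram
    never occurs in one of the two classes.
    """
    # Document frequency: count each multigram at most once per document.
    pos_df = Counter(g for d in pos_docs for g in set(multigrams(d, max_len)))
    neg_df = Counter(g for d in neg_docs for g in set(multigrams(d, max_len)))

    scores = {}
    for g in set(pos_df) | set(neg_df):
        p_pos = (pos_df[g] + eps) / (len(pos_docs) + 2 * eps)
        p_neg = (neg_df[g] + eps) / (len(neg_docs) + 2 * eps)
        scores[g] = math.log(p_pos * (1 - p_neg) / ((1 - p_pos) * p_neg))
    return scores


def select_features(scores, k):
    """Keep the k highest-scoring multigrams (ties broken alphabetically)."""
    return sorted(scores, key=lambda g: (-scores[g], g))[:k]


if __name__ == "__main__":
    pos = [["text", "mining"], ["text", "mining", "tools"]]
    neg = [["stock", "market"], ["text", "market"]]
    ranked = odds_ratio_scores(pos, neg)
    print(select_features(ranked, 2))
```

Multigrams that occur mostly in the target category receive a positive score and those that occur mostly outside it receive a negative one, so keeping only the top `k` shrinks the feature space while retaining phrases that discriminate the category.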