ISDA '08 Proceedings of the 2008 Eighth International Conference on Intelligent Systems Design and Applications - Volume 02
Classifying a user's question into one of several topics helps respondents answer it in a community question answering (cQA) service. To improve category (or topic) classification, a word weighting method must estimate an appropriate weight for each word. In this paper, we propose a novel and effective word weighting method based on a language model for automatic category classification in cQA services. We first calculate the occurrence probability of each word in each category using a language model; the final weight of a word for a category is then estimated as the ratio of its occurrence probability in that category to its occurrence probability in the other categories. As a result, the proposed method significantly improves category classification performance.
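The ratio-based weighting described above can be illustrated with a minimal Python sketch. The abstract does not specify which language model is used or how the "other categories" are combined, so this sketch makes two assumptions: a unigram model with additive (Laplace) smoothing, and pooling of all other categories into a single background distribution. The function name `category_word_weights` and the parameter `alpha` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def category_word_weights(docs_by_category, alpha=1.0):
    """Return weights[category][word] = P(word | category) / P(word | other categories).

    docs_by_category maps a category name to a list of tokenized questions.
    Assumption: unigram probabilities with additive smoothing (alpha);
    the source only says "a language model".
    """
    # Per-category word counts.
    counts = {
        cat: Counter(tok for doc in docs for tok in doc)
        for cat, docs in docs_by_category.items()
    }
    # Corpus-wide counts, used to derive "all other categories" by subtraction.
    total = Counter()
    for cnt in counts.values():
        total.update(cnt)
    vocab_size = len(total)
    grand_total = sum(total.values())

    weights = {}
    for cat, cnt in counts.items():
        n_in = sum(cnt.values())       # tokens inside this category
        n_out = grand_total - n_in     # tokens in all other categories (pooled)
        weights[cat] = {}
        for word in total:
            p_in = (cnt[word] + alpha) / (n_in + alpha * vocab_size)
            p_out = (total[word] - cnt[word] + alpha) / (n_out + alpha * vocab_size)
            weights[cat][word] = p_in / p_out
    return weights


if __name__ == "__main__":
    # Toy data, invented for this example only.
    toy = {
        "travel": [["cheap", "flight", "hotel"], ["flight", "visa"]],
        "computers": [["laptop", "battery"], ["cheap", "laptop", "driver"]],
    }
    w = category_word_weights(toy)
    for word in ("flight", "cheap", "laptop"):
        print(word, round(w["travel"][word], 3))
```

In this toy data, "flight" occurs only in the travel category and therefore receives a weight above 1 for that category, "cheap" occurs equally often in both categories and receives a weight of 1, and "laptop" receives a weight below 1. A ratio of this kind rewards words that discriminate between categories rather than words that are merely frequent.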