Enhancing naive bayes with various smoothing methods for short text classification

Authors:
Quan Yuan;Gao Cong;Nadia Magnenat Thalmann
Affiliations:
Nanyang Technological University, Singapore, Singapore;Nanyang Technological University, Singapore, Singapore;Nanyang Technological University, Singapore, Singapore
Venue:
Proceedings of the 21st international conference companion on World Wide Web
Year:
2012

Citing 3
Cited 1

Hierarchically Classifying Documents Using Very Few Words

ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
A study of smoothing methods for language models applied to information retrieval

ACM Transactions on Information Systems (TOIS)
Learning to classify short and sparse text & web with hidden topics from large-scale data collections

Proceedings of the 17th international conference on World Wide Web

Effectiveness of state-of-the-art features for microblog search

Proceedings of the 28th Annual ACM Symposium on Applied Computing

Quantified Score

Hi-index	0.00

Visualization

Abstract

Partly due to the proliferance of microblog, short texts are becoming prominent. A huge number of short texts are generated every day, which calls for a method that can efficiently accommodate new data to incrementally adjust classification models. Naive Bayes meets such a need. We apply several smoothing models to Naive Bayes for question topic classification, as an example of short text classification, and study their performance. The experimental results on a large real question data show that the smoothing methods are able to significantly improve the question classification performance of Naive Bayes. We also study the effect of training data size, and question length on performance.