A new method of parameter estimation for multinomial naive bayes text classifiers

  • Authors:
  • Sang-Bum Kim; Hae-Chang Rim; Heui-Seok Lim

  • Affiliations:
  • Korea University, SEOUL, KOREA; Korea University, SEOUL, KOREA; Chonan University, ChungChong-NamDo, KOREA

  • Venue:
  • SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2002

Abstract

Multinomial naive Bayes classifiers have been widely used for probabilistic text classification. However, their parameter estimation method sometimes generates inappropriate probabilities. In this paper, we propose a topic document model approach for naive Bayes text classification, in which the parameters are estimated with an expectation from the training documents. Experiments are conducted on the Reuters-21578 and 20 Newsgroups collections, and the proposed approach obtains a significant improvement in performance over the conventional approach.
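
For orientation, the following is a minimal sketch of what the "conventional approach" to multinomial naive Bayes parameter estimation commonly looks like: class-conditional term probabilities computed from pooled term counts with additive (Laplace) smoothing. It is not the estimator proposed in the paper, whose details are not given in this abstract. The function names `train_multinomial_nb` and `classify`, the `alpha` parameter, and the token-list input format are assumptions made purely for illustration.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Estimate log priors and smoothed log term probabilities per class.

    docs   -- list of token lists (hypothetical input format)
    labels -- list of class labels, one per document
    alpha  -- additive smoothing constant (1.0 gives Laplace smoothing)
    """
    vocab = sorted({w for d in docs for w in d})
    docs_per_class = Counter(labels)

    # Pool the term counts of all training documents in each class.
    term_counts = defaultdict(Counter)
    for tokens, c in zip(docs, labels):
        term_counts[c].update(tokens)

    log_prior = {c: math.log(k / len(docs)) for c, k in docs_per_class.items()}

    log_lik = {}
    for c in docs_per_class:
        denom = sum(term_counts[c].values()) + alpha * len(vocab)
        log_lik[c] = {
            w: math.log((term_counts[c][w] + alpha) / denom) for w in vocab
        }
    return log_prior, log_lik


def classify(tokens, log_prior, log_lik):
    """Return the class with the highest posterior log score.

    Tokens outside the training vocabulary are ignored.
    """
    scores = {
        c: log_prior[c] + sum(log_lik[c][w] for w in tokens if w in log_lik[c])
        for c in log_prior
    }
    return max(scores, key=scores.get)
```

Because the counts are pooled over all documents of a class, a few long documents can dominate the estimates; this is one commonly cited weakness of the pooled estimator, and whether it is the specific problem the paper targets is not stated in the abstract.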