Document classification using a finite mixture model

  • Authors:
  • Hang Li;Kenji Yamanishi

  • Affiliations:
  • C&C Res. Labs., NEC, Miyazaki Miyamae-ku Kawasaki, Japan;C&C Res. Labs., NEC, Miyazaki Miyamae-ku Kawasaki, Japan

  • Venue:
  • ACL '97 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
  • Year:
  • 1997

Abstract

We propose a new method of classifying documents into categories. We define for each category a finite mixture model based on soft clustering of words. We treat the problem of classifying documents as that of conducting statistical hypothesis testing over finite mixture models, and employ the EM algorithm to efficiently estimate parameters in a finite mixture model. Experimental results indicate that our method outperforms existing methods.
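
The abstract does not give the model's details, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each word cluster k has a fixed word distribution P(w|k), that each category c is a mixture of those clusters with weights P(k|c) estimated by EM, and that classification picks the category whose mixture gives the document the highest log-likelihood (a simplification of the hypothesis-testing formulation). The names em_mixture_weights, log_likelihood, classify, the smoothing constant FLOOR, and the toy data are all hypothetical.

```python
import math

FLOOR = 1e-9  # assumed smoothing value for words a cluster does not contain


def em_mixture_weights(words, cluster_word_probs, n_iter=50):
    """Estimate mixing weights P(k|c) by EM, holding P(w|k) fixed."""
    k = len(cluster_word_probs)
    weights = [1.0 / k] * k  # start from a uniform mixture
    for _ in range(n_iter):
        counts = [0.0] * k
        for w in words:
            # E-step: posterior responsibility of each cluster for word w
            joint = [weights[j] * cluster_word_probs[j].get(w, FLOOR)
                     for j in range(k)]
            total = sum(joint)
            for j in range(k):
                counts[j] += joint[j] / total
        # M-step: new weights are the normalized expected counts
        n = sum(counts)
        weights = [c / n for c in counts]
    return weights


def log_likelihood(words, weights, cluster_word_probs):
    """Log-likelihood of a word sequence under one category's mixture."""
    return sum(
        math.log(sum(wt * p.get(w, FLOOR)
                     for wt, p in zip(weights, cluster_word_probs)))
        for w in words
    )


def classify(words, category_weights, cluster_word_probs):
    """Return the category whose mixture best explains the document."""
    return max(
        category_weights,
        key=lambda c: log_likelihood(words, category_weights[c],
                                     cluster_word_probs),
    )


# Toy data (invented for illustration): two word clusters, two categories.
clusters = [
    {"ball": 0.5, "goal": 0.5},   # cluster 0: sports-like words
    {"stock": 0.5, "bond": 0.5},  # cluster 1: finance-like words
]
training = {
    "sports": ["ball", "goal", "ball", "stock"],
    "finance": ["stock", "bond", "stock", "goal"],
}
category_weights = {c: em_mixture_weights(ws, clusters)
                    for c, ws in training.items()}
predicted = classify(["ball", "goal"], category_weights, clusters)
```

Because a word may receive non-zero probability from several clusters, each category is described by a soft combination of clusters rather than by a hard word-to-category assignment.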