Dimensionality reduction of features for text categorization

  • Authors:
  • Parisut Jitpakdee;Worapoj Kreesuradej

  • Affiliations:
  • Faculty of Information Technology, King Mongkut's Institute of Technology Ladkrabang, Bangkok, Thailand;Faculty of Information Technology, King Mongkut's Institute of Technology Ladkrabang, Bangkok, Thailand

  • Venue:
  • ACST'07 Proceedings of the third conference on IASTED International Conference: Advances in Computer Science and Technology
  • Year:
  • 2007

Abstract

This paper proposes a new technique for dimensionality reduction of features for text categorization. Unlike conventional methods, our phrase features are generated from word sequences of different lengths (multigrams) taken from phrases extracted from whole documents. We then use the odds ratio (OR) to perform phrase feature selection. In preliminary experiments, the proposed technique performs better than conventional methods.
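
The abstract gives no implementation details, so the following is only a minimal sketch of the two steps it names: generating multigram features and ranking them by odds ratio. Everything specific here is an assumption rather than the authors' method: whitespace tokenization of whole documents (the paper's phrase extraction step is not reproduced), a maximum sequence length `max_n`, document-level presence counts, additive smoothing `eps` to avoid division by zero, and the commonly used log form OR(t, c) = log[P(t|c)(1 - P(t|not c)) / ((1 - P(t|c)) P(t|not c))]. The function names are hypothetical.

```python
import math
from collections import Counter


def multigrams(tokens, max_n=3):
    """Return all contiguous word sequences of length 1..max_n (assumed limit)."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def odds_ratio_scores(docs, labels, target, max_n=3, eps=0.5):
    """Score each multigram by its log odds ratio for the category `target`.

    Probabilities are estimated from document frequencies with additive
    smoothing `eps` (an assumption, not taken from the paper).
    """
    pos_df, neg_df = Counter(), Counter()
    n_pos = n_neg = 0
    for text, label in zip(docs, labels):
        # Count each feature at most once per document.
        feats = set(multigrams(text.lower().split(), max_n))
        if label == target:
            n_pos += 1
            pos_df.update(feats)
        else:
            n_neg += 1
            neg_df.update(feats)

    scores = {}
    for f in set(pos_df) | set(neg_df):
        p_pos = (pos_df[f] + eps) / (n_pos + 2 * eps)  # P(t | c)
        p_neg = (neg_df[f] + eps) / (n_neg + 2 * eps)  # P(t | not c)
        scores[f] = math.log(p_pos * (1 - p_neg) / ((1 - p_pos) * p_neg))
    return scores


def select_top_k(scores, k):
    """Keep the k features with the highest odds ratio."""
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Calling `odds_ratio_scores(docs, labels, target="finance")` on a small labeled corpus and passing the result to `select_top_k` yields a reduced feature set; features that occur mostly in the target category receive positive scores, and those that occur mostly elsewhere receive negative ones.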