Use of linguistic features in context-sensitive text classification

  • Authors:
  • Alex K. S. Wong;John W. T. Lee;Daniel S. Yeung

  • Affiliations:
  • Department of Computing, The Hong Kong Polytechnic University, Kowloon, Hong Kong;Department of Computing, The Hong Kong Polytechnic University, Kowloon, Hong Kong;Department of Computing, The Hong Kong Polytechnic University, Kowloon, Hong Kong

  • Venue:
  • ICMLC'05 Proceedings of the 4th international conference on Advances in Machine Learning and Cybernetics
  • Year:
  • 2005

Abstract

Many popular Text Classification (TC) models use the simple occurrence of words in a document as the features on which to base their classifications. By design, they commonly assume that word occurrences are statistically independent. Although this assumption does not hold in general, these TC models are robust and efficient at their task. Some recent studies have shown that context-sensitive TC approaches generally perform better. On the other hand, although complex linguistic or semantic features may intuitively seem more relevant to TC, studies of their effectiveness have produced mixed and inconclusive results. In this paper, we present our investigation into the use of some complex linguistic features with two context-sensitive TC methods. Our experimental results show potential advantages of such an approach.
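
To make the word-occurrence baseline mentioned in the abstract concrete, the following is a minimal sketch of one classifier of that kind: a Bernoulli naive Bayes model over binary word-occurrence features, which treats each word's presence as independent given the class. It is an illustration only, not the context-sensitive methods or linguistic features studied in the paper. The function names (`tokenize`, `train`, `classify`), the whitespace tokenizer, the Laplace smoothing, and the toy documents are all assumptions introduced here.

```python
import math
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; return the set of distinct words."""
    return set(text.lower().split())


def train(docs):
    """Collect per-class document counts from (text, label) pairs."""
    class_docs = defaultdict(int)                      # label -> number of documents
    word_docs = defaultdict(lambda: defaultdict(int))  # label -> word -> docs containing it
    vocab = set()
    for text, label in docs:
        words = tokenize(text)
        class_docs[label] += 1
        vocab |= words
        for w in words:
            word_docs[label][w] += 1
    return class_docs, word_docs, vocab


def classify(text, model):
    """Return the label with the highest log-posterior under the independence assumption."""
    class_docs, word_docs, vocab = model
    words = tokenize(text)
    total = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label, n in class_docs.items():
        score = math.log(n / total)  # log prior of the class
        for w in vocab:
            # Laplace-smoothed estimate of P(word occurs | class)
            p = (word_docs[label].get(w, 0) + 1) / (n + 2)
            # Each word contributes independently, whether present or absent
            score += math.log(p if w in words else 1.0 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, for illustration only
TOY_DOCS = [
    ("cheap pills buy now", "spam"),
    ("buy cheap watches", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project meeting notes", "ham"),
]
model = train(TOY_DOCS)
print(classify("buy cheap pills", model))
```

Because the score is a sum of per-word terms, word order and surrounding context have no influence on the result. That is the limitation that context-sensitive approaches aim to address.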