A Feature Selection Method Based on Fisher's Discriminant Ratio for Text Sentiment Classification

  • Authors:
  • Suge Wang;Deyu Li;Yingjie Wei;Hongxia Li

  • Affiliations:
  • School of Mathematics Science, Shanxi University, Taiyuan, China 030006 and Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi Univers ...;School of Computer & Information Technology, Shanxi University, Taiyuan, China 030006 and Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, ...;Science Press, Beijing, China 100717;School of Mathematics Science, Shanxi University, Taiyuan, China 030006

  • Venue:
  • WISM '09 Proceedings of the International Conference on Web Information Systems and Mining
  • Year:
  • 2009

Abstract

With the rapid growth of e-commerce, product reviews on the Web have become an important source of information for customers deciding whether to buy a product. Because there are often too many reviews for a customer to read, automatically classifying them by sentiment orientation (i.e., positive or negative) has become a research problem. This paper proposes an effective feature selection method, based on Fisher's discriminant ratio, for sentiment classification of product review text. To validate the proposed method, we compare it with methods based on information gain and mutual information, using a support vector machine as the classifier. Six sub-experiments are conducted by combining different feature selection methods with two kinds of candidate feature sets. On a corpus of 1006 car review documents, the experimental results indicate that Fisher's discriminant ratio based on word frequency estimation performs best, with an F value of 83.3%, when the candidate features are the words that appear in both positive and negative texts.
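
The abstract does not spell out the scoring formula, so the following is only a minimal sketch of the general idea, assuming the textbook two-class form of Fisher's discriminant ratio: for each term, the squared difference of its mean frequency in positive and negative documents, divided by the sum of the two within-class variances. The function names (fisher_scores, select_top_k), the eps smoothing constant, and the toy documents are illustrative assumptions, not taken from the paper. Candidate features are restricted to words that occur in both classes, as in the best-performing setting reported above.

```python
from collections import Counter


def fisher_scores(pos_docs, neg_docs, eps=1e-12):
    """Score terms with the textbook Fisher ratio (assumed form, not the paper's exact estimator).

    pos_docs / neg_docs: lists of tokenized documents (lists of words).
    Returns {term: (mean_pos - mean_neg)^2 / (var_pos + var_neg + eps)}.
    """
    pos_counts = [Counter(doc) for doc in pos_docs]
    neg_counts = [Counter(doc) for doc in neg_docs]

    # Candidate features: words that appear in both positive and negative texts.
    pos_vocab = set().union(*pos_counts)
    neg_vocab = set().union(*neg_counts)
    candidates = pos_vocab & neg_vocab

    def mean_and_var(counts, term):
        # Per-document word frequency of `term`, then population mean and variance.
        values = [c[term] for c in counts]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        return mean, var

    scores = {}
    for term in candidates:
        mean_p, var_p = mean_and_var(pos_counts, term)
        mean_n, var_n = mean_and_var(neg_counts, term)
        # eps (an assumption) avoids division by zero when both variances are 0.
        scores[term] = (mean_p - mean_n) ** 2 / (var_p + var_n + eps)
    return scores


def select_top_k(scores, k):
    """Return the k terms with the highest Fisher ratio."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data for illustration only; not from the car-review corpus.
pos_docs = [["good", "engine", "good"], ["good", "seat"], ["engine", "good", "noise"]]
neg_docs = [["noise", "engine"], ["noise", "noise", "good"], ["engine", "noise"]]

scores = fisher_scores(pos_docs, neg_docs)
top_terms = select_top_k(scores, 2)
```

In this toy example "engine" is equally frequent in both classes and scores zero, while "good" and "noise" separate the classes and rank highest. The selected terms would then serve as the feature set for a classifier such as a support vector machine.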