Dual word and document seed selection for semi-supervised sentiment classification

  • Authors:
  • Shengfeng Ju; Shoushan Li; Yan Su; Guodong Zhou; Yu Hong; Xiaojun Li

  • Affiliations:
  • Soochow University, Suzhou, China; Soochow University, Suzhou, China; Soochow University, Suzhou, China; Soochow University, Suzhou, China; Soochow University, Suzhou, China; Zhejiang Gongshang University, Hangzhou, China

  • Venue:
  • Proceedings of the 21st ACM international conference on Information and knowledge management
  • Year:
  • 2012

Abstract

Semi-supervised sentiment classification aims to train a classifier with a small amount of labeled data (called seed data) and a large amount of unlabeled data. A major advantage of this approach is that it saves annotation effort by exploiting unlabeled data, which are usually freely available. In this paper, we propose an approach that further reduces the annotation effort of semi-supervised sentiment classification by actively selecting the seed data. Specifically, we propose a novel selection strategy that simultaneously selects good words and documents for manual annotation by considering both their annotation cost and their informativeness. Experimental results demonstrate the effectiveness of our approach.