Feature selection for sentiment analysis based on content and syntax models

Authors:
Adnan Duric;Fei Song
Affiliations:
University of Guelph, Guelph, Ontario, Canada;University of Guelph, Guelph, Ontario, Canada
Venue:
WASSA '11 Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis
Year:
2011

Citing 12
Cited 0

Foundations of statistical natural language processing

Foundations of statistical natural language processing
Machine learning in automated text categorization

ACM Computing Surveys (CSUR)
Sentiment Analyzer: Extracting Sentiments about a Given Topic using Natural Language Processing Techniques

ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Using appraisal groups for sentiment analysis

Proceedings of the 14th ACM international conference on Information and knowledge management
Thumbs up?: sentiment classification using machine learning techniques

EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Learning subjective nouns using extraction pattern bootstrapping

CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Determining the sentiment of opinions

COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Recognizing contextual polarity in phrase-level sentiment analysis

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
A holistic lexicon-based approach to opinion mining

WSDM '08 Proceedings of the 2008 International Conference on Web Search and Data Mining
Yahoo! for Amazon: Sentiment Extraction from Small Talk on the Web

Management Science
Mining opinion features in customer reviews

AAAI'04 Proceedings of the 19th national conference on Artifical intelligence

Quantified Score

Hi-index	0.00

Visualization

Abstract

Recent solutions for sentiment analysis have relied on feature selection methods ranging from lexicon-based approaches where the set of features are generated by humans, to approaches that use general statistical measures where features are selected solely on empirical evidence. The advantage of statistical approaches is that they are fully automatic, however, they often fail to separate features that carry sentiment from those that do not. In this paper we propose a set of new feature selection schemes that use a Content and Syntax model to automatically learn a set of features in a review document by separating the entities that are being reviewed from the subjective expressions that describe those entities in terms of polarities. By focusing only on the subjective expressions and ignoring the entities, we can choose more salient features for document-level sentiment analysis. The results obtained from using these features in a maximum entropy classifier are competitive with the state-of-the-art machine learning approaches.