A cross-corpus study of unsupervised subjectivity identification based on calibrated EM

Authors:
Dong Wang; Yang Liu
Affiliations:
The University of Texas at Dallas; The University of Texas at Dallas
Venue:
WASSA '11 Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis
Year:
2011

Abstract

In this study, we investigate an unsupervised generative learning method for subjectivity detection in text across different domains. We create an initial training set using simple lexicon information, then use a calibrated EM (expectation-maximization) method to learn from unannotated data. We evaluate this unsupervised approach on three domains: movie data, news text, and meeting dialogues. We also perform a thorough analysis of the factors that affect unsupervised learning, such as the size and self-labeling accuracy of the initial training set. Our experiments and analysis show inherent differences across domains and a performance gain from calibration in EM.
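
The pipeline described in the abstract (lexicon-based self-labeling followed by calibrated EM) could be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy lexicon, the multinomial naive Bayes model, the function names, and in particular the calibration step (shifting all log-odds by one constant so that the mean posterior matches a target class proportion) are assumptions chosen for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter

# Hypothetical toy lexicon; the abstract only says "simple lexicon information".
SUBJ_LEXICON = {"love", "hate", "great", "terrible", "awful", "wonderful"}


def seed_labels(docs, lexicon=SUBJ_LEXICON):
    """Self-label: P(subjective) = 1.0 if any lexicon word occurs, else 0.0."""
    return [1.0 if lexicon & set(d) else 0.0 for d in docs]


def m_step(docs, post, vocab, alpha=1.0):
    """Re-estimate the class prior and smoothed word likelihoods from soft labels."""
    prior = min(max(sum(post) / len(docs), 1e-6), 1 - 1e-6)
    counts = [Counter(), Counter()]  # index 0 = objective, 1 = subjective
    for d, p in zip(docs, post):
        for w in d:
            counts[1][w] += p
            counts[0][w] += 1.0 - p
    logp = []
    for c in (0, 1):
        total = sum(counts[c].values()) + alpha * len(vocab)
        logp.append({w: math.log((counts[c][w] + alpha) / total) for w in vocab})
    return prior, logp


def e_step(docs, prior, logp):
    """Posterior P(subjective | doc) under multinomial naive Bayes."""
    post = []
    for d in docs:
        s1 = math.log(prior) + sum(logp[1][w] for w in d)
        s0 = math.log(1 - prior) + sum(logp[0][w] for w in d)
        diff = max(min(s0 - s1, 700.0), -700.0)  # guard against exp overflow
        post.append(1.0 / (1.0 + math.exp(diff)))
    return post


def calibrate(post, target, iters=50):
    """Assumed calibration: shift all log-odds by one constant (found by
    bisection) so that the mean posterior equals the target proportion."""
    eps = 1e-9
    logits = [math.log((p + eps) / (1 - p + eps)) for p in post]
    lo, hi = -50.0, 50.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        mean = sum(1 / (1 + math.exp(-(z + mid))) for z in logits) / len(logits)
        if mean < target:
            lo = mid
        else:
            hi = mid
    shift = (lo + hi) / 2
    return [1 / (1 + math.exp(-(z + shift))) for z in logits]


def calibrated_em(docs, target_prior=0.5, n_iter=10):
    """Seed with lexicon labels, then alternate M-step and calibrated E-step."""
    vocab = {w for d in docs for w in d}
    post = seed_labels(docs)
    for _ in range(n_iter):
        prior, logp = m_step(docs, post, vocab)
        post = calibrate(e_step(docs, prior, logp), target_prior)
    return post
```

Here each document is a list of tokens, and calibrated_em returns one subjectivity probability per document. The target_prior parameter stands in for whatever class-distribution assumption a given domain warrants; varying it, along with the size and accuracy of the seed set, is one way to probe the factors the abstract mentions.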