Biographies or blenders: which resource is best for cross-domain sentiment analysis?

Authors:
Natalia Ponomareva; Mike Thelwall
Affiliations:
Statistical Cybermetrics Research Group, University of Wolverhampton, UK; Statistical Cybermetrics Research Group, University of Wolverhampton, UK
Venue:
CICLing'12 Proceedings of the 13th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part I
Year:
2012

Abstract

Domain adaptation is usually discussed from the point of view of new algorithms that minimise performance loss when applying a classifier trained on one domain to another. However, finding pertinent data similar to the test domain is equally important for achieving high accuracy in a cross-domain task. This study proposes an algorithm for automatic estimation of performance loss in the context of cross-domain sentiment classification. We present and validate several measures of domain similarity specially designed for the sentiment classification task. We also introduce a new characteristic, called domain complexity, as another independent factor influencing performance loss, and propose various functions for its approximation. Finally, a linear regression for modelling accuracy loss is built and tested in different evaluation settings. As a result, we are able to predict the accuracy loss with an average error of 1.5% and a maximum error of 3.4%.
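
The abstract does not specify which similarity or complexity measures the paper uses, so the following is only a minimal sketch of the general idea under stated assumptions. Jensen-Shannon divergence over unigram frequencies stands in for a domain similarity measure, type-token ratio stands in for domain complexity, and ordinary least squares stands in for the linear regression. All function names are hypothetical and none of this is claimed to be the authors' implementation.

```python
import math
from collections import Counter

import numpy as np


def unigram_dist(docs):
    """Relative token frequencies over a list of whitespace-tokenised documents."""
    counts = Counter(tok for doc in docs for tok in doc.lower().split())
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two distributions, in [0, 1].

    Assumption: used here as a stand-in for a domain (dis)similarity measure.
    """
    div = 0.0
    for tok in set(p) | set(q):
        pi, qi = p.get(tok, 0.0), q.get(tok, 0.0)
        mi = 0.5 * (pi + qi)
        if pi > 0:
            div += 0.5 * pi * math.log2(pi / mi)
        if qi > 0:
            div += 0.5 * qi * math.log2(qi / mi)
    return div


def type_token_ratio(docs):
    """Distinct tokens divided by total tokens.

    Assumption: a crude proxy for domain complexity, chosen for illustration.
    """
    tokens = [tok for doc in docs for tok in doc.lower().split()]
    return len(set(tokens)) / len(tokens)


def fit_loss_model(features, losses):
    """Least-squares fit of loss ~= w0 + w1 * f1 + w2 * f2 + ...

    `features` holds one row per (source, target) domain pair.
    Returns the coefficient vector [w0, w1, w2, ...].
    """
    feats = np.asarray(features, dtype=float)
    design = np.column_stack([np.ones(len(feats)), feats])
    coef, *_ = np.linalg.lstsq(design, np.asarray(losses, dtype=float), rcond=None)
    return coef


def predict_loss(coef, feature_row):
    """Predicted accuracy loss for one domain pair."""
    return float(coef[0] + np.dot(coef[1:], np.asarray(feature_row, dtype=float)))
```

In use, each training row would pair the divergence between a source and a target domain with some complexity feature (for example, the difference in type-token ratio), and the target value would be the accuracy drop measured when a classifier trained on the source is applied to the target. The fitted coefficients can then be passed to `predict_loss` to estimate the drop for an unseen domain pair before any labelled target data is available.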