Is machine translation ripe for cross-lingual sentiment classification?

Authors:
Kevin Duh;Akinori Fujino;Masaaki Nagata
Affiliations:
NTT Communication Science Laboratories, Hikari-dai, Seika-cho, Kyoto, Japan;NTT Communication Science Laboratories, Hikari-dai, Seika-cho, Kyoto, Japan;NTT Communication Science Laboratories, Hikari-dai, Seika-cho, Kyoto, Japan
Venue:
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Year:
2011

Citing 6
Cited 2

Domain adaptation with structural correspondence learning

EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Multilingual subjectivity analysis using machine translation

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Hierarchical Bayesian domain adaptation

NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Co-training for cross-lingual sentiment classification

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Cross-language text classification using structural correspondence learning

ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Cross lingual adaptation: an experiment on sentiment classifications

ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers

Cross-lingual mixture model for sentiment classification

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Cross-lingual polarity detection with machine translation

Proceedings of the Second International Workshop on Issues of Sentiment Discovery and Opinion Mining

Quantified Score

Hi-index	0.00

Visualization

Abstract

Recent advances in Machine Translation (MT) have brought forth a new paradigm for building NLP applications in low-resource scenarios. To build a sentiment classifier for a language with no labeled resources, one can translate labeled data from another language, then train a classifier on the translated text. This can be viewed as a domain adaptation problem, where labeled translations and test data have some mismatch. Various prior work have achieved positive results using this approach. In this opinion piece, we take a step back and make some general statements about cross-lingual adaptation problems. First, we claim that domain mismatch is not caused by MT errors, and accuracy degradation will occur even in the case of perfect MT. Second, we argue that the cross-lingual adaptation problem is qualitatively different from other (monolingual) adaptation problems in NLP; thus new adaptation algorithms ought to be considered. This paper will describe a series of carefully-designed experiments that led us to these conclusions.