Joint bilingual sentiment classification with unlabeled parallel corpora

Authors:
Bin Lu; Chenhao Tan; Claire Cardie; Benjamin K. Tsou
Affiliations:
City University of Hong Kong, Hong Kong and Hong Kong Institute of Education, Hong Kong; Cornell University, Ithaca, NY; Cornell University, Ithaca, NY; City University of Hong Kong, Hong Kong and Hong Kong Institute of Education, Hong Kong
Venue:
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Year:
2011


Abstract

Most previous work on multilingual sentiment analysis has focused on methods to adapt sentiment resources from resource-rich languages to resource-poor languages. We present a novel approach for joint bilingual sentiment classification at the sentence level that augments available labeled data in each language with unlabeled parallel data. We rely on the intuition that the sentiment labels for parallel sentences should be similar and present a model that jointly learns improved monolingual sentiment classifiers for each language. Experiments on multiple data sets show that the proposed approach (1) outperforms the monolingual baselines, significantly improving the accuracy for both languages by 3.44%–8.12%; (2) outperforms two standard approaches for leveraging unlabeled data; and (3) still produces performance gains, albeit smaller ones, when employing pseudo-parallel data from machine translation engines.
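
The abstract does not spell out the model's objective, so the following is only a minimal sketch of the stated intuition, not the authors' model. It assumes two logistic-regression classifiers, one per language, trained on their own labeled sentences, plus a penalty that pushes their predicted probabilities to agree on unlabeled parallel sentence pairs. The squared-difference penalty, the plain gradient-descent loop, and every name and default value (`joint_train`, `lam`, `l2`, `lr`, `steps`) are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def joint_train(lab1, lab2, parallel, lam=1.0, l2=0.01, lr=0.1, steps=500):
    """Jointly train one logistic-regression classifier per language.

    lab1, lab2: lists of (features, label) pairs, label in {0, 1},
                one list per language (hypothetical input format).
    parallel:   list of (features_lang1, features_lang2) pairs for
                unlabeled parallel sentences.
    Minimised objective (an assumed form, not taken from the paper):
        mean log-loss(lang 1) + mean log-loss(lang 2)
        + lam * mean((p1 - p2)^2 over parallel pairs)
        + (l2 / 2) * (||w1||^2 + ||w2||^2)
    """
    w1 = [0.0] * len(lab1[0][0])
    w2 = [0.0] * len(lab2[0][0])
    for _ in range(steps):
        # Start each gradient from the L2 regulariser.
        g1 = [l2 * w for w in w1]
        g2 = [l2 * w for w in w2]
        # Supervised part: gradient of the mean log-loss in each language.
        for data, w, g in ((lab1, w1, g1), (lab2, w2, g2)):
            for x, y in data:
                err = (sigmoid(dot(w, x)) - y) / len(data)
                for j, xj in enumerate(x):
                    g[j] += err * xj
        # Agreement part: gradient of lam * mean((p1 - p2)^2).
        for u1, u2 in parallel:
            p1, p2 = sigmoid(dot(w1, u1)), sigmoid(dot(w2, u2))
            scale = 2.0 * lam * (p1 - p2) / len(parallel)
            for j, uj in enumerate(u1):
                g1[j] += scale * p1 * (1.0 - p1) * uj
            for j, uj in enumerate(u2):
                g2[j] -= scale * p2 * (1.0 - p2) * uj
        w1 = [w - lr * g for w, g in zip(w1, g1)]
        w2 = [w - lr * g for w, g in zip(w2, g2)]
    return w1, w2


def predict(w, x):
    """Return 1 (positive) or 0 (negative) for one feature vector."""
    return 1 if sigmoid(dot(w, x)) >= 0.5 else 0


# Toy data with made-up two-dimensional features: [bias, polarity cue].
lang1 = [([1.0, 2.0], 1), ([1.0, -2.0], 0)]
lang2 = [([1.0, 1.5], 1), ([1.0, -1.5], 0)]
pairs = [([1.0, 1.0], [1.0, 0.5]), ([1.0, -1.0], [1.0, -0.5])]
w_lang1, w_lang2 = joint_train(lang1, lang2, pairs)
print(predict(w_lang1, [1.0, 0.8]), predict(w_lang2, [1.0, -0.8]))
```

Setting `lam=0` in this sketch reduces it to two independent monolingual classifiers, which is one way to see what the unlabeled parallel data contribute. The machine-translation variant mentioned in the abstract would correspond to filling `pairs` with sentences and their automatic translations instead of human-aligned text.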