Cross-lingual adaptation is a special case of domain adaptation and refers to the transfer of classification knowledge between two languages. In this article we describe an extension of Structural Correspondence Learning (SCL), a recently proposed algorithm for domain adaptation, for cross-lingual adaptation in the context of text classification. The proposed method uses unlabeled documents from both languages, along with a word translation oracle, to induce a cross-lingual representation that enables the transfer of classification knowledge from the source to the target language. The main advantages of this method over existing methods are resource efficiency and task specificity. We conduct experiments in the area of cross-language topic and sentiment classification involving English as source language and German, French, and Japanese as target languages. The results show a significant improvement of the proposed method over a machine translation baseline, reducing the relative error due to cross-lingual adaptation by an average of 30% (topic classification) and 59% (sentiment classification). We further report on empirical analyses that reveal insights into the use of unlabeled data, the sensitivity with respect to important hyperparameters, and the nature of the induced cross-lingual word correspondences.
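As a rough illustration of how unlabeled documents and a word translation oracle might be combined into a shared representation, the following is a minimal sketch in the spirit of SCL, not the article's implementation. It makes several assumptions that the abstract does not state: documents are bag-of-words vectors over a joint source and target vocabulary, the oracle output is given as a hypothetical list `pivot_pairs` of column indices, ridge regression stands in for whatever pivot predictors the method actually uses, and the function name `induce_projection`, the parameters `k` and `reg`, and the toy data are invented for this example.

```python
import numpy as np


def induce_projection(X_unlabeled, pivot_pairs, k, reg=1.0):
    """Sketch of an SCL-style cross-lingual projection (illustrative only).

    X_unlabeled: (n_docs, n_features) bag-of-words matrix holding unlabeled
                 documents from both languages over a joint vocabulary.
    pivot_pairs: list of (source_col, target_col) index pairs; assumed to come
                 from a word translation oracle (hypothetical input format).
    k:           number of dimensions to keep (at most len(pivot_pairs)).
    Returns theta with shape (n_features, k).
    """
    X = np.asarray(X_unlabeled, dtype=float)
    n_feats = X.shape[1]

    # Mask the pivot columns so a predictor cannot simply read off its target.
    pivot_cols = sorted({c for pair in pivot_pairs for c in pair})
    X_masked = X.copy()
    X_masked[:, pivot_cols] = 0.0

    # One linear predictor per pivot pair: does the document contain the
    # source word or its translation? Ridge regression is an assumed stand-in.
    A = X_masked.T @ X_masked + reg * np.eye(n_feats)
    W = np.zeros((n_feats, len(pivot_pairs)))
    for j, (s, t) in enumerate(pivot_pairs):
        y = np.where((X[:, s] > 0) | (X[:, t] > 0), 1.0, -1.0)
        W[:, j] = np.linalg.solve(A, X_masked.T @ y)

    # The top-k left singular vectors of the stacked weights define the map.
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    return U[:, :k]


# Toy joint vocabulary (hypothetical):
# 0 good, 1 excellent, 2 boring, 3 gut, 4 exzellent, 5 langweilig, 6 dull, 7 oede
X = np.array([
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 1],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 0, 0, 0],
])
pivot_pairs = [(0, 3), (2, 5)]  # (good, gut), (boring, langweilig)

theta = induce_projection(X, pivot_pairs, k=2)
Z = X @ theta  # documents from either language in the shared k-dim space
```

Under these assumptions, a classifier trained on the projected source-language documents could be applied to projected target-language documents, since both live in the same low-dimensional space. The choice of pivots, the predictor type, and `k` are exactly the kind of hyperparameters whose sensitivity the abstract says is analyzed empirically.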