In this paper, we introduce a method that automatically builds text classifiers in a new language by training on already labeled data in another language. Our method transfers the classification knowledge across languages by translating the model features and by using an Expectation Maximization (EM) algorithm that naturally takes into account the ambiguity associated with the translation of a word. We further exploit the readily available unlabeled data in the target language via semi-supervised learning, and adapt the translated model to better fit the data distribution of the target language.
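To illustrate how an EM procedure can account for translation ambiguity, the following is a minimal sketch, not the formulation used in the paper. It assumes a class-conditional word distribution from the source-language model (`source_probs`), a bilingual dictionary listing candidate translations for each source word (`dictionary`), and word counts from unlabeled target-language text (`target_counts`). All names, the toy data, and the simple mixture model itself are hypothetical choices made for illustration.

```python
from collections import defaultdict


def em_translation_probs(source_probs, dictionary, target_counts, iterations=20):
    """Estimate P(target word | source word) with EM.

    Assumed model (illustrative only): each target-word occurrence is
    generated by picking a source word s with probability source_probs[s]
    and then translating it to one of its dictionary candidates.
    """
    # Initialise P(t | s) uniformly over each source word's candidates.
    trans = {
        s: {t: 1.0 / len(cands) for t in cands}
        for s, cands in dictionary.items()
    }

    for _ in range(iterations):
        expected = {s: defaultdict(float) for s in dictionary}

        # E-step: split each observed target count among the source words
        # that could have produced it, in proportion to P(s) * P(t | s).
        for t, n in target_counts.items():
            scores = {
                s: source_probs.get(s, 0.0) * trans[s][t]
                for s in dictionary
                if t in trans[s]
            }
            z = sum(scores.values())
            if z == 0.0:
                continue  # target word not explained by any source word
            for s, score in scores.items():
                expected[s][t] += n * score / z

        # M-step: renormalise the expected counts per source word.
        for s in dictionary:
            total = sum(expected[s].values())
            if total > 0.0:
                trans[s] = {t: expected[s][t] / total for t in dictionary[s]}

    return trans


# Toy English-to-Spanish example; "orilla" is a candidate for both source words.
source_probs = {"bank": 0.6, "river": 0.4}
dictionary = {"bank": ["banco", "orilla"], "river": ["rio", "orilla"]}
target_counts = {"banco": 8, "orilla": 2, "rio": 6}

probs = em_translation_probs(source_probs, dictionary, target_counts)
for s, dist in probs.items():
    print(s, {t: round(p, 3) for t, p in dist.items()})
```

In this sketch the ambiguous candidate is shared between the source words according to how well each explains the target-language counts, so the translated feature weights shift toward the senses actually used in the target data.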