A systematic comparison of various statistical alignment models
Computational Linguistics
Inducing multilingual POS taggers and NP bracketers via robust projection across aligned corpora
NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies
Inducing a multilingual dictionary from a parallel multitext in related languages
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Unsupervised multilingual learning for POS tagging
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Weakly supervised part-of-speech tagging for morphologically-rich, resource-scarce languages
EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
Unsupervised multilingual grammar induction
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Multilingual part-of-speech tagging: two unsupervised approaches
Journal of Artificial Intelligence Research
Phylogenetic grammar induction
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Posterior Regularization for Structured Latent Variable Models
The Journal of Machine Learning Research
NLPLING '10 Proceedings of the 2010 Workshop on NLP and Linguistics: Finding the Common Ground
Improved unsupervised POS induction using intrinsic clustering quality and a Zipfian constraint
CoNLL '10 Proceedings of the Fourteenth Conference on Computational Natural Language Learning
Unsupervised multilingual learning
Unsupervised multilingual learning
Multi-source transfer of delexicalized dependency parsers
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Universal morphological analysis using structured nearest neighbor prediction
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Leveraging supplemental representations for sequential transduction
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Nudging the envelope of direct transfer methods for multilingual named entity recognition
WILS '12 Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure
Universal grapheme-to-phoneme prediction over Latin alphabets
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Hi-index | 0.00 |
We investigate the problem of unsupervised part-of-speech tagging when raw parallel data is available in a large number of languages. Patterns of ambiguity vary greatly across languages and therefore even unannotated multilingual data can serve as a learning signal. We propose a non-parametric Bayesian model that connects related tagging decisions across languages through the use of multilingual latent variables. Our experiments show that performance improves steadily as the number of languages increases.