On the limited memory BFGS method for large scale optimization
Mathematical Programming: Series A and B
The mathematics of statistical machine translation: parameter estimation
Computational Linguistics - Special issue on using large corpora: II
Building a large annotated corpus of English: the penn treebank
Computational Linguistics - Special issue on using large corpora: II
TnT: a statistical part-of-speech tagger
ANLC '00 Proceedings of the sixth conference on Applied natural language processing
HMM-based word alignment in statistical translation
COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 2
Inducing multilingual POS taggers and NP bracketers via robust projection across aligned corpora
NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies
A backoff model for bootstrapping resources for non-English languages
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
CoNLL-X shared task on multilingual dependency parsing
CoNLL-X '06 Proceedings of the Tenth Conference on Computational Natural Language Learning
Unsupervised multilingual grammar induction
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Dependency grammar induction via bitext projection constraints
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Minimized models for unsupervised part-of-speech tagging
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Multilingual part-of-speech tagging: two unsupervised approaches
Journal of Artificial Intelligence Research
Painless unsupervised learning with features
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Efficient graph-based semi-supervised learning of structured tagging models
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Two decades of unsupervised POS induction: how far have we come?
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Using universal linguistic knowledge to guide grammar induction
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Model-based aligner combination using dual decomposition
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Semi-supervised frame-semantic parsing for unknown predicates
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Evaluating unsupervised learning for natural language processing tasks
EMNLP '11 Proceedings of the First Workshop on Unsupervised Learning in NLP
Unsupervised bilingual POS tagging with Markov random fields
EMNLP '11 Proceedings of the First Workshop on Unsupervised Learning in NLP
Unsupervised structure prediction with non-parallel multilingual guidance
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Multi-source transfer of delexicalized dependency parsers
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Unsupervised dependency parsing without gold part-of-speech tags
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Cross-lingual word clusters for direct transfer of linguistic structure
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Graph-based lexicon expansion with sparsity-inducing penalties
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Translation-based projection for multilingual coreference resolution
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Nudging the envelope of direct transfer methods for multilingual named entity recognition
WILS '12 Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure
Multilingual WSD with just a few lines of code: the BabelNet API
ACL '12 Proceedings of the ACL 2012 System Demonstrations
Multilingual named entity recognition using parallel data and metadata from Wikipedia
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
A cost sensitive part-of-speech tagging: differentiating serious errors from minor errors
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
A graph-based cross-lingual projection approach for weakly supervised relation extraction
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2
Syntactic transfer using a bilingual lexicon
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Bilingual lexicon extraction from comparable corpora using label propagation
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Type-supervised hidden Markov models for part-of-speech tagging with incomplete tag dictionaries
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Fast large-scale approximate graph construction for NLP
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Wiki-ly supervised part-of-speech tagging
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Cross-Lingual Annotation Projection for Weakly-Supervised Relation Extraction
ACM Transactions on Asian Language Information Processing (TALIP)
Chinese-English mixed text normalization
Proceedings of the 7th ACM international conference on Web search and data mining
Aligned-Parallel-Corpora Based Semi-Supervised Learning for Arabic Mention Detection
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)
Hi-index | 0.00 |
We describe a novel approach for inducing unsupervised part-of-speech taggers for languages that have no labeled training data, but have translated text in a resource-rich language. Our method does not assume any knowledge about the target language (in particular no tagging dictionary is assumed), making it applicable to a wide array of resource-poor languages. We use graph-based label propagation for cross-lingual knowledge transfer and use the projected labels as features in an unsupervised model (Berg-Kirkpatrick et al., 2010). Across eight European languages, our approach results in an average absolute improvement of 10.4% over a state-of-the-art baseline, and 16.7% over vanilla hidden Markov models induced with the Expectation Maximization algorithm.