Building a large annotated corpus of English: the penn treebank
Computational Linguistics - Special issue on using large corpora: II
Unsupervised learning of the morphology of a natural language
Computational Linguistics
Two languages are more informative than one
ACL '91 Proceedings of the 29th annual meeting on Association for Computational Linguistics
Inducing multilingual text analysis tools via robust projection across aligned corpora
HLT '01 Proceedings of the first international conference on Human language technology research
Knowledge-free induction of inflectional morphologies
NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies
Inducing multilingual POS taggers and NP bracketers via robust projection across aligned corpora
NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies
Minimally supervised morphological analysis by multimodal alignment
ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
Bootstrapping parsers via syntactic projection across parallel texts
Natural Language Engineering
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Unsupervised models for morpheme segmentation and morphology learning
ACM Transactions on Speech and Language Processing (TSLP)
An unsupervised morpheme-based HMM for hebrew morphological disambiguation
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Optimal constituent alignment with edge covers for semantic projection
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Unsupervised multilingual learning for POS tagging
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Cross-lingual propagation for morphological analysis
AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 2
Shared logistic normal distributions for soft parameter tying in unsupervised grammar induction
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Unsupervised morphological segmentation with log-linear models
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Linguistically naïve != language independent: why NLP needs linguistic typology
ILCL '09 Proceedings of the EACL 2009 Workshop on the Interaction between Linguistics and Computational Linguistics: Virtuous, Vicious or Vacuous?
Unsupervised multilingual grammar induction
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1
Multilingual part-of-speech tagging: two unsupervised approaches
Journal of Artificial Intelligence Research
Phylogenetic grammar induction
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Learning better monolingual models with unannotated bilingual text
CoNLL '10 Proceedings of the Fourteenth Conference on Computational Natural Language Learning
Universal grapheme-to-phoneme prediction over Latin alphabets
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Hi-index | 0.00 |
In this paper, we consider the problem of unsupervised morphological analysis from a new angle. Past work has endeavored to design unsupervised learning methods which explicitly or implicitly encode inductive biases appropriate to the task at hand. We propose instead to treat morphological analysis as a structured prediction problem, where languages with labeled data serve as training examples for unlabeled languages, without the assumption of parallel data. We define a universal morphological feature space in which every language and its morphological analysis reside. We develop a novel structured nearest neighbor prediction method which seeks to find the morphological analysis for each unlabeled language which lies as close as possible in the feature space to a training language. We apply our model to eight inflecting languages, and induce nominal morphology with substantially higher accuracy than a traditional, MDL-based approach. Our analysis indicates that accuracy continues to improve substantially as the number of training languages increases.