Unsupervised learning of the morphology of a natural language
Computational Linguistics
Unsupervised word sense disambiguation rivaling supervised methods
ACL '95 Proceedings of the 33rd annual meeting on Association for Computational Linguistics
Inducing multilingual text analysis tools via robust projection across aligned corpora
HLT '01 Proceedings of the first international conference on Human language technology research
Knowledge-free induction of inflectional morphologies
NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies
Minimally supervised morphological analysis by multimodal alignment
ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
Knowledge-free induction of morphology using latent semantic analysis
ConLL '00 Proceedings of the 2nd workshop on Learning language in logic and the 4th conference on Computational natural language learning - Volume 7
Unsupervised learning of morphology using a novel directed search algorithm: taking the first step
MPL '02 Proceedings of the ACL-02 workshop on Morphological and phonological learning - Volume 6
Unsupervised discovery of morphemes
MPL '02 Proceedings of the ACL-02 workshop on Morphological and phonological learning - Volume 6
MPL '02 Proceedings of the ACL-02 workshop on Morphological and phonological learning - Volume 6
An algorithm for the unsupervised learning of morphology
Natural Language Engineering
Unsupervised models for morpheme segmentation and morphology learning
ACM Transactions on Speech and Language Processing (TSLP)
A framework for unsupervised natural language morphology induction
ACLstudent '04 Proceedings of the ACL 2004 workshop on Student research
Nonparametric bayesian models of lexical acquisition
Nonparametric bayesian models of lexical acquisition
Evaluating automation strategies in language documentation
HLT '09 Proceedings of the NAACL HLT 2009 Workshop on Active Learning for Natural Language Processing
Unsupervised multilingual learning for POS tagging
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Induction of a simple morphology for highly-inflecting languages
SIGMorPhon '04 Proceedings of the 7th Meeting of the ACL Special Interest Group in Computational Phonology: Current Themes in Computational Phonology and Morphology
Learning probabilistic paradigms for morphology in a latent class model
SIGPHON '06 Proceedings of the Eighth Meeting of the ACL Special Interest Group on Computational Phonology and Morphology
A naive theory of affixation and an algorithm for extraction
SIGPHON '06 Proceedings of the Eighth Meeting of the ACL Special Interest Group on Computational Phonology and Morphology
Morphology induction from term clusters
CONLL '05 Proceedings of the Ninth Conference on Computational Natural Language Learning
Hi-index | 0.00 |
Many approaches to unsupervised morphology acquisition incorporate the frequency of character sequences with respect to each other to identify word stems and affixes. This typically involves heuristic search procedures and calibrating multiple arbitrary thresholds. We present a simple approach that uses no thresholds other than those involved in standard application of X2 significance testing. A key part of our approach is using document boundaries to constrain generation of candidate stems and affixes and clustering morphological variants of a given word stem. We evaluate our model on English and the Mayan language Uspanteko; it compares favorably to two benchmark systems which use considerably more complex strategies and rely more on experimentally chosen threshold values.