Research on Language and Computation
This paper presents an approach to the unsupervised learning of parts of speech that uses both morphological and syntactic information. The model is more complex than those employed for unsupervised learning of POS tags in English, which use only syntactic information, but the variety of the world's languages requires that we consider morphology as well: in many languages, morphology provides better clues to a word's category than word order does. We describe the computational model for POS learning and report results from applying it to Bulgarian, a Slavic language with relatively free word order and rich morphology.