Unsupervised learning of Bulgarian POS tags

Authors:
Derrick Higgins
Affiliations:
Educational Testing Service
Venue:
MorphSlav '03 Proceedings of the 2003 EACL Workshop on Morphological Processing of Slavic Languages
Year:
2003

Abstract

This paper presents an approach to the unsupervised learning of parts of speech that uses both morphological and syntactic information. The model is more complex than those employed for unsupervised learning of POS tags in English, which use only syntactic information, but the variety of the world's languages requires that morphology be considered as well: in many languages, morphology provides better clues to a word's category than word order does. We describe the computational model for POS learning and report results from applying it to Bulgarian, a Slavic language with relatively free word order and rich morphology.