This paper investigates the problem of determining grammatical gender for the nouns of a language starting from minimal resources: either a very small list of seed nouns whose gender is known, or translingual projection of natural gender. We show that a bootstrapping process combining contextual clues from an unannotated corpus with morphological clues modeled by suffix tries can induce accurate gender predictions for five diverse test languages.
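To make the idea of morphological clues modeled with suffix tries more concrete, the following is a minimal sketch and not the paper's implementation. It assumes a trie built over reversed seed nouns, with each node counting the genders of the seeds that share that suffix, and it predicts with the majority gender at the longest matching suffix. The class name `SuffixTrie`, the `min_count` parameter, and the toy Spanish seed list are all hypothetical choices made for illustration; the contextual clues and the bootstrapping loop are not modeled here.

```python
from collections import Counter


class SuffixTrie:
    """Trie over reversed words; each node counts the genders of seed nouns sharing that suffix."""

    def __init__(self):
        self.children = {}       # next character (moving leftwards in the word) -> child node
        self.counts = Counter()  # gender label -> number of seed nouns ending in this suffix

    def add(self, noun, gender):
        """Record one seed noun with a known gender along its suffix path."""
        node = self
        node.counts[gender] += 1  # the root corresponds to the empty suffix
        for ch in reversed(noun):
            node = node.children.setdefault(ch, SuffixTrie())
            node.counts[gender] += 1

    def predict(self, noun, min_count=1):
        """Return the majority gender at the longest suffix seen at least min_count times."""
        node = self
        best = node.counts
        for ch in reversed(noun):
            node = node.children.get(ch)
            if node is None or sum(node.counts.values()) < min_count:
                break
            best = node.counts
        return best.most_common(1)[0][0] if best else None


# Hypothetical toy seed list (illustrative only, not data from the paper).
seeds = [("casa", "f"), ("mesa", "f"), ("libro", "m"), ("perro", "m")]

trie = SuffixTrie()
for noun, gender in seeds:
    trie.add(noun, gender)

print(trie.predict("gato"))   # longest known suffix is "-o"
print(trie.predict("silla"))  # longest known suffix is "-a"
```

In a bootstrapping setting, predictions made with high confidence could be added back as new seeds and the trie rebuilt; how confidence is estimated and how contextual evidence is combined with the trie are left open in this sketch.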