Unsupervised learning of word-category guessing rules

  • Authors: Andrei Mikheev
  • Affiliations: University of Edinburgh, Edinburgh, Scotland, UK
  • Venue: ACL '96: Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics
  • Year: 1996

Abstract

Words unknown to the lexicon present a substantial problem for part-of-speech tagging. In this paper we present a technique for fully unsupervised statistical acquisition of rules that guess possible parts of speech for unknown words. Three complementary sets of word-guessing rules are induced from the lexicon and a raw corpus: prefix morphological rules, suffix morphological rules and ending-guessing rules. Learning was performed on the Brown Corpus, and the resulting rule sets showed highly competitive performance when compared with the state of the art.
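
To make the idea of ending-guessing rules concrete, the following is a minimal sketch and not the paper's actual procedure. It assumes a lexicon given as a plain dictionary from words to sets of part-of-speech tags, and it replaces the statistical rule scoring (which the abstract does not describe) with a simple frequency threshold. It also ignores the raw corpus. The names `induce_ending_rules`, `guess`, `max_len`, `min_count` and `min_stem` are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def induce_ending_rules(lexicon, max_len=3, min_count=2, min_stem=2):
    """Map word endings to the POS class most often seen with them.

    lexicon: dict of word -> set of POS tags (assumed format).
    A rule is kept only if its POS class occurs at least min_count times;
    this threshold is a stand-in for a proper statistical score.
    """
    counts = defaultdict(Counter)
    for word, tags in lexicon.items():
        for n in range(1, max_len + 1):
            # Require a remainder of at least min_stem characters.
            if len(word) - n < min_stem:
                break
            counts[word[-n:]][frozenset(tags)] += 1

    rules = {}
    for ending, class_counts in counts.items():
        pos_class, freq = class_counts.most_common(1)[0]
        if freq >= min_count:
            rules[ending] = pos_class
    return rules


def guess(word, rules, max_len=3):
    """Return the POS class for the longest matching ending, or None."""
    for n in range(min(max_len, len(word) - 1), 0, -1):
        pos_class = rules.get(word[-n:])
        if pos_class is not None:
            return pos_class
    return None


# Toy lexicon, invented for illustration only.
lexicon = {
    "walking": {"VBG"},
    "talking": {"VBG"},
    "jumping": {"VBG"},
    "quickly": {"RB"},
    "slowly": {"RB"},
    "table": {"NN"},
}
rules = induce_ending_rules(lexicon)
print(guess("running", rules))
print(guess("happily", rules))
print(guess("cable", rules))
```

On this toy lexicon, "running" matches the ending "ing" and "happily" matches "ly", while "cable" matches no rule because its endings occur only once. Preferring the longest matching ending is a design choice of this sketch; a real system would rank competing rules by an estimate of their accuracy.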