A practical solution to the problem of automatic part-of-speech induction from text

  • Authors: Reinhard Rapp
  • Affiliations: University of Mainz, Germersheim, Germany
  • Venue: ACLdemo '05 Proceedings of the ACL 2005 on Interactive poster and demonstration sessions
  • Year: 2005

Abstract

The problem of part-of-speech induction from text involves two aspects: first, a set of word classes must be derived automatically; second, each word of a vocabulary must be assigned to one or several of these word classes. In this paper, we present a method that solves both problems with good accuracy. Our approach adopts a mixture of statistical methods that have been successfully applied in word sense induction. Its main advantage over previous attempts is that it reduces the syntactic space to only the most important dimensions, thereby almost eliminating the otherwise omnipresent problem of data sparseness.
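
The abstract does not name the specific statistical methods, so the following Python sketch only illustrates the general idea of reducing a sparse syntactic space to a few dense dimensions. It rests on assumptions that are not taken from the paper: each word is described by counts of its immediate left and right neighbours, the reduction is a truncated singular value decomposition, and words are compared by cosine similarity. The function names `context_matrix`, `reduce_dimensions`, and `cosine_similarity` are hypothetical, and the corpus is a toy example.

```python
import numpy as np


def context_matrix(tokens, vocab):
    """Count the immediate left and right neighbours of each vocabulary word.

    Row i describes vocab[i]; the first len(vocab) columns hold left-neighbour
    counts and the remaining len(vocab) columns hold right-neighbour counts.
    (Assumed representation, not taken from the paper.)
    """
    index = {word: i for i, word in enumerate(vocab)}
    n = len(vocab)
    counts = np.zeros((n, 2 * n))
    for left, right in zip(tokens, tokens[1:]):
        if left in index and right in index:
            counts[index[right], index[left]] += 1       # `left` precedes `right`
            counts[index[left], n + index[right]] += 1   # `right` follows `left`
    return counts


def reduce_dimensions(matrix, k):
    """Project the rows onto the k strongest dimensions (truncated SVD)."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


# Toy corpus (illustrative only).
tokens = "the dog runs . the cat runs . a dog sleeps . a cat sleeps .".split()
vocab = sorted(set(tokens))
reduced = reduce_dimensions(context_matrix(tokens, vocab), k=4)

dog, cat, the = (vocab.index(w) for w in ("dog", "cat", "the"))
print("dog vs cat:", cosine_similarity(reduced[dog], reduced[cat]))
print("dog vs the:", cosine_similarity(reduced[dog], reduced[the]))
```

In this sketch, words that occur in the same contexts end up with nearly identical reduced vectors, so grouping the rows of `reduced` with any clustering algorithm would yield candidate word classes. Assigning a word to several classes, as the abstract describes, would need an additional step that is not shown here.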