HunPos: an open source trigram tagger

  • Authors:
  • Péter Halácsy;András Kornai;Csaba Oravecz

  • Affiliations:
  • Budapest U. of Technology, Budapest, Stoczek;MetaCarta Inc., Cambridge, MA;Institute of Linguistics, Budapest, Benczur

  • Venue:
  • ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

In the world of non-proprietary NLP software the standard, and perhaps the best, HMM-based POS tagger is TnT (Brants, 2000). We argue here that some of the criticism aimed at HMM performance on languages with rich morphology should more properly be directed at TnT's peculiar license, free but not open source, since it is those details of the implementation which are hidden from the user that hold the key for improved POS tagging across a wider variety of languages. We present HunPos, a free and open source (LGPL-licensed) alternative, which can be tuned by the user to fully utilize the potential of HMM architectures, offering performance comparable to more complex models, but preserving the ease and speed of the training and tagging process.