A Morphological Tagger for Korean: Statistical Tagging Combined with Corpus-Based Morphological Rule Application

  • Authors:
  • Chung-Hye Han; Martha Palmer

  • Affiliations:
  • Department of Linguistics, Simon Fraser University, Burnaby, Canada V5A 1S6; Department of Computer and Information Sciences, University of Pennsylvania, Philadelphia, USA 19104-6389

  • Venue:
  • Machine Translation
  • Year:
  • 2004

Abstract

This paper describes a novel approach to morphological tagging for Korean, an agglutinative language with a very productive inflectional system. The tagger takes raw text as input and returns a lemmatized and morphologically disambiguated output for each word: the lemma is labeled with a part-of-speech (POS) tag and the inflections are labeled with inflectional tags. Unlike the standard approach to tagging for morphologically complex languages, in which morphological analysis comes first, our proposed approach places the tagging phase before the analysis phase. The tagger comprises a trigram-based tagging component followed by a morphological rule application component, and it obtains 95% precision and recall on unseen test data.
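The tag-then-analyze order described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The romanized words, the composite tags (such as `NNC+PAD`), the add-one smoothing, and the hand-written `RULES` table are all assumptions made for illustration; in the paper the morphological rules are derived from a corpus. The sketch first assigns each word a composite tag with a trigram model over tags, then uses that tag to select a rule that splits the word into a lemma and its inflections.

```python
import math
from collections import Counter

# Hypothetical toy corpus: romanized words paired with composite tags.
TRAIN = [
    [("hakkyoey", "NNC+PAD"), ("kassta", "VV+EPF+EFN")],
    [("cipey", "NNC+PAD"), ("kassta", "VV+EPF+EFN")],
    [("hakkyo", "NNC"), ("kassta", "VV+EPF+EFN")],
]
START = "<s>"

# Collect trigram, context, emission, and tag counts.
tri, ctx, emit, tag_count = Counter(), Counter(), Counter(), Counter()
vocab = set()
for sent in TRAIN:
    prev2, prev1 = START, START
    for word, tag in sent:
        tri[(prev2, prev1, tag)] += 1
        ctx[(prev2, prev1)] += 1
        emit[(tag, word)] += 1
        tag_count[tag] += 1
        vocab.add(word)
        prev2, prev1 = prev1, tag
TAGS = sorted(tag_count)

def log_trans(t2, t1, t):
    # Add-one smoothed P(t | t2, t1); the smoothing choice is an assumption.
    return math.log((tri[(t2, t1, t)] + 1) / (ctx[(t2, t1)] + len(TAGS)))

def log_emit(t, w):
    # Add-one smoothed P(w | t), reserving one slot for unknown words.
    return math.log((emit[(t, w)] + 1) / (tag_count[t] + len(vocab) + 1))

def tag_sentence(words):
    """Viterbi search over tag pairs; returns the best tag sequence."""
    best = {(START, START): (0.0, [])}
    for w in words:
        nxt = {}
        for (t2, t1), (score, path) in best.items():
            for t in TAGS:
                s = score + log_trans(t2, t1, t) + log_emit(t, w)
                if (t1, t) not in nxt or s > nxt[(t1, t)][0]:
                    nxt[(t1, t)] = (s, path + [t])
        best = nxt
    return max(best.values(), key=lambda v: v[0])[1]

# Hypothetical rules keyed by composite tag: (suffix, tagged suffix pieces).
RULES = {
    "NNC": [("", [])],
    "NNC+PAD": [("ey", [("ey", "PAD")])],
    "VV+EPF+EFN": [("ssta", [("ss", "EPF"), ("ta", "EFN")])],
}

def analyze(word, tag):
    """Split a tagged word into (lemma, POS) plus tagged inflections."""
    for suffix, pieces in RULES.get(tag, []):
        if word.endswith(suffix):
            stem = word[: len(word) - len(suffix)]
            return [(stem, tag.split("+")[0])] + pieces
    return [(word, tag)]  # no rule matched: leave the word unsegmented

def tag_and_analyze(words):
    # Tagging runs first; analysis is then restricted by the chosen tag.
    return [analyze(w, t) for w, t in zip(words, tag_sentence(words))]
```

Calling `tag_and_analyze(["cipey", "kassta"])` on this toy data tags the words as `NNC+PAD` and `VV+EPF+EFN` and then segments them into `cip/NNC + ey/PAD` and `ka/VV + ss/EPF + ta/EFN`. Because the tag is chosen before analysis, the rule lookup only considers segmentations compatible with that tag instead of enumerating every possible analysis of the word.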