Automatic TIMEX2 tagging of Korean news

  • Authors: Seok Bae Jang; Jennifer Baldwin; Inderjeet Mani

  • Affiliations: Georgetown University; Georgetown University; Georgetown University

  • Venue: ACM Transactions on Asian Language Information Processing (TALIP) - Special Issue on Temporal Information Processing
  • Year: 2004

Abstract

This article reports on a temporal tagger for Korean based on a Korean extension of the TIDES TIMEX2 guidelines. The extension, which primarily addresses the idiosyncrasies of Korean morphology, shows high inter-annotator reliability (0.893 F-measure for tag extent) when applied to a corpus of Korean newspaper articles. Three approaches are compared: a machine-learning approach based on rote learning from a human-edited, automatically derived dictionary of temporal expressions; a second approach that adds manual patterns; and a third one that tries to learn the patterns. Results for the first two are promising (0.87 F-measure for tag extent). Overall, the article shows that rote learning approaches can be very useful when language-specific features such as morphology are taken into account.
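To make the two central ideas in the abstract concrete, the following is a minimal sketch of dictionary lookup for tag extents and of an extent-level F-measure. It is not the authors' system: the function names `tag_extents` and `extent_f_measure`, the two-entry lexicon, and the sample sentence are hypothetical, and counting a predicted extent as correct only when it matches a gold extent exactly is an assumption about the scoring, not something the abstract states.

```python
def tag_extents(text, lexicon):
    """Return (start, end) character spans of lexicon entries found in text.

    Scans left to right and prefers the longest entry at each position.
    Matching on characters rather than whitespace tokens lets an entry
    such as "오늘" (today) be found inside "오늘은", where a particle is
    attached to the temporal expression.
    """
    entries = sorted(lexicon, key=len, reverse=True)
    spans = []
    i = 0
    while i < len(text):
        for entry in entries:
            if text.startswith(entry, i):
                spans.append((i, i + len(entry)))
                i += len(entry)
                break
        else:
            i += 1  # no entry starts here; move on one character
    return spans


def extent_f_measure(gold, predicted):
    """F-measure over extents, assuming exact span match counts as correct."""
    gold_set, pred_set = set(gold), set(predicted)
    correct = len(gold_set & pred_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: a two-entry lexicon and one sentence.
lexicon = {"오늘", "내일"}  # "today", "tomorrow"
text = "오늘은 비가 오고 내일은 맑다"
predicted = tag_extents(text, lexicon)
gold = [(0, 2), (10, 12)]
print(predicted, extent_f_measure(gold, predicted))
```

A plain lookup like this only recovers expressions already listed in the dictionary, which is one way to read the abstract's second and third approaches: patterns, whether written by hand or learned, would extend coverage beyond the memorized entries.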