Automatic TIMEX2 tagging of Korean news

Authors:
Seok Bae Jang;Jennifer Baldwin;Inderjeet Mani
Affiliations:
Georgetown University;Georgetown University;Georgetown University
Venue:
ACM Transactions on Asian Language Information Processing (TALIP) - Special Issue on Temporal Information Processing
Year:
2004

Abstract

This article reports on a temporal tagger for Korean based on a Korean extension of the TIDES TIMEX2 guidelines. The extension, which primarily addresses the idiosyncrasies of Korean morphology, shows high inter-annotator reliability (0.893 F-measure for tag extent) when applied to a corpus of Korean newspaper articles. A machine-learning approach based on rote learning from a human-edited, automatically derived dictionary of temporal expressions is compared with a second approach that adds manual patterns and a third one that tries to learn the patterns. Results for the first two are promising (0.87 F-measure for tag extent). Overall, the article shows that rote learning approaches can be very useful when language-specific features such as morphology are taken into account.
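
To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that "rote learning" can be approximated by greedy longest-match lookup against a dictionary of known temporal expressions, and that "F-measure for tag extent" means balanced F over exactly matching character spans. The function names, the toy lexicon, and the sample sentence are hypothetical, and the sketch ignores the Korean morphological analysis that the article identifies as important.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def tag_with_dictionary(text: str, lexicon: Set[str]) -> List[Span]:
    """Greedy longest-match lookup of known temporal expressions.

    Assumption: plain string matching, with no morphological analysis.
    """
    # Try longer entries first so "내일 오전" wins over "내일".
    entries = sorted(lexicon, key=len, reverse=True)
    spans: List[Span] = []
    i = 0
    while i < len(text):
        for entry in entries:
            if entry and text.startswith(entry, i):
                spans.append((i, i + len(entry)))
                i += len(entry)  # continue after the matched extent
                break
        else:
            i += 1  # no entry starts here; move one character forward
    return spans


def extent_f_measure(predicted: List[Span], gold: List[Span]) -> float:
    """Balanced F-measure where only exact extent matches count as correct."""
    pred, ref = set(predicted), set(gold)
    correct = len(pred & ref)
    if correct == 0:
        return 0.0
    precision = correct / len(pred)
    recall = correct / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical toy lexicon: "tomorrow", "tomorrow morning", "today".
    lexicon = {"내일", "내일 오전", "오늘"}
    sentence = "회의는 내일 오전에 열린다"  # "The meeting is held tomorrow morning"
    for start, end in tag_with_dictionary(sentence, lexicon):
        print(start, end, sentence[start:end])
```

In this sketch a dictionary entry matches even when a particle is attached (here "에" after "오전"), which hints at why handling of Korean morphology affects extent decisions; a real tagger would need a morphological analyzer to decide where the tagged extent should end.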