Effective use of TimeBank for TimeML analysis

  • Authors:
  • Branimir Boguraev; Rie Kubota Ando

  • Affiliations:
  • IBM T.J. Watson Research Center, Hawthorne, NY; IBM T.J. Watson Research Center, Hawthorne, NY

  • Venue:
  • Proceedings of the 2005 international conference on Annotating, extracting and reasoning about time and events
  • Year:
  • 2005

Abstract

TimeML is an expressive language for temporal information, but its rich representational properties raise the bar for traditional information extraction methods when applied to the task of text-to-TimeML analysis. We analyse the extent to which TimeBank, the reference corpus for TimeML, supports development of TimeML-compliant analytics. The first release of the corpus exhibits challenging characteristics: small size and some noise. Nonetheless, a particular design of a time annotator trained on TimeBank is able to exploit the data in an implementation which deploys a hybrid analytical strategy of mixing aggressive finite-state processing over linguistic annotations with a state-of-the-art machine learning technique capable of leveraging large amounts of unannotated data. We present our design, in light of encouraging performance results; we also interpret these results in relation to a close analysis of TimeBank's annotation 'profile'. We conclude that even the first release of the corpus is invaluable; we further argue for more infrastructure work needed to create a larger and more robust reference corpus.