Exploiting structure for event discovery using the MDI algorithm

  • Authors: Martina Naughton
  • Affiliations: University College Dublin, Ireland
  • Venue: ACL '07 Proceedings of the 45th Annual Meeting of the ACL: Student Research Workshop
  • Year: 2007

Abstract

Identifying events in unstructured text is difficult, largely because a single event can be expressed across several sentences. In this paper, we investigate the use of clustering methods for grouping the text spans in a news article that refer to the same event. The key idea is to cluster the sentences using a novel distance metric that exploits regularities in the sequential structure of events within a document. Compared with a simple bag-of-words baseline, this approach yields a statistically significant increase in performance.
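
To make the general idea concrete, the following is a minimal sketch of sentence clustering with a distance that mixes lexical and positional information. It is not the metric proposed in the paper, which the abstract does not specify. The bag-of-words cosine distance stands in for the baseline, and the positional term, the weight `alpha`, the `threshold` stopping rule, and all function names are assumptions introduced purely for illustration.

```python
import math
from collections import Counter


def bow_distance(a, b):
    """Cosine distance between the bag-of-words vectors of two sentences."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return 1.0 - dot / norm if norm else 1.0


def combined_distance(sents, i, j, alpha=0.5):
    """Hypothetical blend (an assumption, not the paper's metric):
    lexical distance plus the normalised gap between sentence positions."""
    gap = abs(i - j) / max(len(sents) - 1, 1)
    return alpha * bow_distance(sents[i], sents[j]) + (1 - alpha) * gap


def cluster(sents, threshold, alpha=0.5):
    """Average-link agglomerative clustering over sentence indices.
    Merging stops once the closest pair of clusters exceeds `threshold`."""
    clusters = [[i] for i in range(len(sents))]

    def link(c1, c2):
        total = sum(combined_distance(sents, i, j, alpha)
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of clusters under average linkage.
        d, x, y = min((link(clusters[x], clusters[y]), x, y)
                      for x in range(len(clusters))
                      for y in range(x + 1, len(clusters)))
        if d > threshold:
            break
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]
    return clusters
```

Setting `alpha=1.0` reduces the distance to the pure bag-of-words case, so the same function can serve as a rough baseline for comparison. Each returned cluster is a list of sentence indices that the sketch treats as referring to one event.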