Widening the field of view of information extraction through sentential event recognition

  • Authors: Siddharth Patwardhan
  • Affiliations: The University of Utah
  • Venue: Widening the field of view of information extraction through sentential event recognition
  • Year: 2010

Abstract

Event-based Information Extraction (IE) is the task of identifying entities that play specific roles within an event described in free text. For example, given text documents containing descriptions of disease outbreak events, the goal of an IE system is to extract event role fillers, such as the disease, the victims, the location, and the date of each disease outbreak described within the documents. IE systems typically rely on local clues around each phrase to identify its role within a relevant event. This research aims to improve IE performance by incorporating evidence from the wider sentential context, enabling the IE model to make better decisions when faced with weak local contextual clues. To make better inferences about event role fillers, this research introduces an "event recognition" phase, which is used in combination with localized text extraction. The event recognizer operates on sentences and locates those sentences that discuss events of interest. Localized text extraction can then capitalize on this information and identify event role fillers even when the evidence in their local context is weak or inconclusive.
First, this research presents PIPER, a pipelined approach for IE incorporating this idea. This model cascades a classifier-based sentential event recognizer with a pattern-based localized text extraction component in a pipeline. This enables the pattern-based system to exploit sentential information for better IE coverage. Second, a unified probabilistic approach for IE, called GLACIER, is introduced to overcome limitations arising from the discrete nature of the pipelined model. GLACIER combines the probability of event sentences with the probability of phrasal event role fillers into a single joint probability, which helps to better balance the influence of the two components in the IE model.
An empirical evaluation of these models shows that the use of an event recognition phase improves IE performance, and that incorporating such additional information through a unified probabilistic model produces the most effective IE system.
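
The contrast between a discrete pipeline and a single joint probability can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the dissertation: the `Candidate` type, the function names, the threshold values, the made-up scores, and in particular the choice to form the joint probability as the product P(event sentence) × P(role filler | event sentence). The abstract states only that the two probabilities are combined into one joint probability.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate phrase with two hypothetical model scores."""
    phrase: str
    p_event_sentence: float      # assumed: P(sentence describes a relevant event)
    p_filler_given_event: float  # assumed: P(phrase fills the role | event sentence)


def pipeline_extract(cands, sentence_threshold=0.5, filler_threshold=0.5):
    """Discrete pipeline: a hard sentence-level decision, then a hard local one."""
    return [
        c.phrase
        for c in cands
        if c.p_event_sentence >= sentence_threshold
        and c.p_filler_given_event >= filler_threshold
    ]


def joint_extract(cands, joint_threshold=0.3):
    """Joint scoring: threshold the product of the two probabilities once."""
    return [
        c.phrase
        for c in cands
        if c.p_event_sentence * c.p_filler_given_event >= joint_threshold
    ]


# Made-up scores for a disease outbreak scenario (illustrative only).
candidates = [
    Candidate("cholera", 0.95, 0.40),           # strong sentence, weak local clue
    Candidate("Harare", 0.45, 0.90),            # borderline sentence, strong local clue
    Candidate("twelve villagers", 0.90, 0.90),  # strong on both
    Candidate("last season", 0.20, 0.30),       # weak on both
]

pipeline_result = pipeline_extract(candidates)
joint_result = joint_extract(candidates)
print("pipeline:", pipeline_result)
print("joint:   ", joint_result)
```

With these invented numbers, the pipeline keeps only the candidate that clears both hard thresholds, while the joint score also keeps candidates where strong evidence from one component offsets weaker evidence from the other. This is one way to read the "balance" the abstract describes. The thresholds are arbitrary and would need tuning on held-out data in any real system.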