Extracting sequences from the web

  • Authors:
  • Anthony Fader; Stephen Soderland; Oren Etzioni

  • Affiliations:
  • University of Washington, Seattle; University of Washington, Seattle; University of Washington, Seattle

  • Venue:
  • ACLShort '10: Proceedings of the ACL 2010 Conference Short Papers
  • Year:
  • 2010

Abstract

Classical Information Extraction (IE) systems fill slots in domain-specific frames. This paper reports on SEQ, a novel open IE system that uses a domain-independent frame to extract ordered sequences, such as the presidents of the United States or the most common causes of death in the U.S. SEQ leverages regularities about sequences to extract a coherent set of sequences from Web text. SEQ nearly doubles the area under the precision-recall curve compared to an extractor that does not exploit these regularities.
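
The abstract reports its result as area under the precision-recall curve. As a rough illustration of that metric only, the sketch below computes a step-wise approximation of the area from a ranked list of extractions. The function name `area_under_pr`, its parameters, and the step-wise (average-precision style) summation are assumptions made for this example; the abstract does not say how the authors computed the area.

```python
from typing import Sequence


def area_under_pr(ranked_labels: Sequence[bool], total_correct: int) -> float:
    """Approximate the area under a precision-recall curve.

    ranked_labels: correctness of each extraction, ordered from most to
        least confident (hypothetical input format).
    total_correct: number of correct extractions that exist in total,
        used as the recall denominator.
    """
    if total_correct <= 0:
        raise ValueError("total_correct must be positive")

    area = 0.0
    hits = 0
    prev_recall = 0.0
    for k, is_correct in enumerate(ranked_labels, start=1):
        if is_correct:
            hits += 1
        recall = hits / total_correct
        precision = hits / k
        # Add a rectangle whose width is the recall gained at this rank.
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


if __name__ == "__main__":
    # Toy ranking: correct, incorrect, correct, correct; 4 correct items exist.
    print(area_under_pr([True, False, True, True], total_correct=4))
```

Under this approximation, a ranking that places correct extractions earlier scores a larger area, which is why the metric suits comparing two extractors over the same set of candidate sequences.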