Mining script-like structures from the web

  • Authors:
  • Niels Kasch; Tim Oates

  • Affiliations:
  • University of Maryland, Baltimore County, Baltimore, MD; University of Maryland, Baltimore County, Baltimore, MD

  • Venue:
  • FAM-LbR '10 Proceedings of the NAACL HLT 2010 First International Workshop on Formalisms and Methodology for Learning by Reading
  • Year:
  • 2010

Abstract

This paper presents preliminary work on extracting script-like structures, called events and event sets, from collections of web documents. In contrast to existing methods, our approach is topic-driven: event sets are extracted for a specified topic. We introduce an iterative system architecture and present methods for reducing the noise inherent in web corpora. Preliminary results show that LSA-based event relatedness yields better event sets from web corpora than previous methods.