Mining script-like structures from the web

Authors:
Niels Kasch;Tim Oates
Affiliations:
University of Maryland, Baltimore County, Baltimore, MD;University of Maryland, Baltimore County, Baltimore, MD
Venue:
FAM-LbR '10 Proceedings of the NAACL HLT 2010 First International Workshop on Formalisms and Methodology for Learning by Reading
Year:
2010

Citing 4
Cited 3

The anatomy of a large-scale hypertextual Web search engine

WWW7 Proceedings of the seventh international conference on World Wide Web 7
DIRT - discovery of inference rules from text

Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Discovery of inference rules for question-answering

Natural Language Engineering
Automatic acquisition of script knowledge from a text collection

EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 2

Automatically producing plot unit representations for narrative text

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Template-based information extraction without the templates

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Towards the unsupervised acquisition of discourse relations

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2

Abstract

This paper presents preliminary work on extracting script-like structures, called events and event sets, from collections of web documents. Unlike existing methods, our approach is topic-driven: event sets are extracted for a specified topic. We introduce an iterative system architecture and present methods to reduce noise in web corpora. Preliminary results show that LSA-based event relatedness yields better event sets from web corpora than previous methods.