Information extraction from unstructured web text

Authors:
Oren Etzioni;Ana-Maria Popescu
Affiliations:
University of Washington;University of Washington
Venue:
Information extraction from unstructured web text
Year:
2007

Citing 0
Cited 4

Open information extraction from the web

Communications of the ACM - Surviving the data deluge
Unsupervised methods for determining object and relation synonyms on the web

Journal of Artificial Intelligence Research
Quantifier scope disambiguation using extracted pragmatic knowledge: preliminary results

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3
Identifying functional relations in web text

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

Quantified Score

Hi-index	0.00

Abstract

In the past few years, the World Wide Web has emerged as an important source of data, much of it in the form of unstructured text. This thesis describes an extensible model for information extraction that takes advantage of the unique characteristics of Web text and leverages existing search engine technology to ensure the quality of the extracted information. The key features of our approach are the use of lexico-syntactic patterns, Web-scale statistics, and unsupervised or semi-supervised learning methods. Our information extraction model has been instantiated and extended to solve a diverse set of information extraction tasks: subclass and related-class extraction, relation property learning, the acquisition of salient product features and corresponding user opinions from customer reviews, and, finally, the mining of commonsense information from the Web for the benefit of integrated AI systems.