Data extraction from the web using wild card queries

  • Authors:
  • Davood Rafiei;Haobin Li

  • Affiliations:
  • University of Alberta, Edmonton, AB, Canada;University of Alberta, Edmonton, AB, Canada

  • Venue:
  • Proceedings of the 18th ACM conference on Information and knowledge management
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper presents an overview of our work for searching and retrieving facts and relationships within natural language text sources. In this work, an extraction task over a text collection is expressed as a query that combines text fragments with wild cards, and the query result is a set of facts in the form of unary, binary and general n-ary tuples. Despite being both simple and declarative, the framework can be applied to a wide range of extraction tasks. This paper presents an overview of the work and its various components. We also report some of our experiments and an evaluation of the proposed querying framework in extracting relevant information to a task.