Scaling textual inference to the web

  • Authors:
  • Stefan Schoenmackers;Oren Etzioni;Daniel S. Weld

  • Affiliations:
  • University of Washington, Seattle, WA;University of Washington, Seattle, WA;University of Washington, Seattle, WA

  • Venue:
  • EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

Most Web-based Q/A systems work by finding pages that contain an explicit answer to a question. These systems are helpless if the answer has to be inferred from multiple sentences, possibly on different pages. To solve this problem, we introduce the Holmes system, which utilizes textual inference (TI) over tuples extracted from text. Whereas previous work on TI (e.g., the literature on textual entailment) has been applied to paragraph-sized texts, Holmes utilizes knowledge-based model construction to scale TI to a corpus of 117 million Web pages. Given only a few minutes, Holmes doubles recall for example queries in three disparate domains (geography, business, and nutrition). Importantly, Holmes's runtime is linear in the size of its input corpus due to a surprising property of many textual relations in the Web corpus---they are "approximately" functional in a well-defined sense.