Names and similarities on the web: fact extraction in the fast lane

  • Authors:
  • Marius Paşca; Dekang Lin; Jeffrey Bigham; Andrei Lifchits; Alpa Jain

  • Affiliations:
  • Google Inc., Mountain View, CA; Google Inc., Mountain View, CA; University of Washington, Seattle, WA; University of British Columbia, Vancouver, BC; Columbia University, New York, NY

  • Venue:
  • ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics
  • Year:
  • 2006

Abstract

In a new approach to large-scale extraction of facts from unstructured text, distributional similarities become an integral part of both the iterative acquisition of high-coverage contextual extraction patterns and the validation and ranking of candidate facts. The evaluation measures the quality and coverage of facts extracted from one hundred million Web documents, starting from ten seed facts and using no additional knowledge, lexicons, or complex tools.