Information extraction with term frequencies

  • Authors: T. R. Lynam, C. L. A. Clarke, G. V. Cormack
  • Affiliation: University of Waterloo, Ontario, Canada (all authors)
  • Venue: HLT '01: Proceedings of the First International Conference on Human Language Technology Research
  • Year: 2001

Abstract

Every day, millions of people use the internet to answer questions. Unfortunately, there is at present no simple, consistently successful way to do so. One common approach is to enter a few terms from a question into a Web search system and scan the resulting pages for the answer, which is a laborious process. To address this need, a question answering (QA) system was created to find and extract answers from a corpus. The system has three parts: a parser that generates queries and categories from questions, a passage retrieval component, and an information extraction (IE) component. The extraction method was designed to draw answers out of the passages collected by the retrieval engine. The subject of this paper is the information extraction component. It is based on the premise that information related to the answer will appear many times in a large corpus such as the Web.
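
The redundancy premise can be illustrated with a minimal sketch. It is not the authors' extraction method, which the abstract does not specify. It only assumes that candidate terms occurring in many retrieved passages are more likely to be related to the answer. The function name `rank_candidates`, the stopword list, and the sample passages are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "in", "is", "was", "to", "and", "on", "by", "at"}

def rank_candidates(passages, question_terms):
    """Rank candidate terms by the number of passages that contain them."""
    # Terms from the question itself are not useful as answers.
    exclude = STOPWORDS | {t.lower() for t in question_terms}
    counts = Counter()
    for passage in passages:
        # Count each term once per passage so one long passage cannot dominate.
        terms = set(re.findall(r"[a-z0-9]+", passage.lower()))
        counts.update(terms - exclude)
    return counts.most_common()

# Made-up passages standing in for the output of a passage retrieval step.
passages = [
    "Ottawa is the capital of Canada.",
    "The capital of Canada, Ottawa, sits on the Ottawa River.",
    "Toronto is the largest city in Canada, but Ottawa is the capital.",
]
ranked = rank_candidates(passages, ["capital", "Canada"])
print(ranked[0])  # ('ottawa', 3)
```

Counting passages rather than raw occurrences is one possible design choice. A system in this spirit might also weight terms by question category or by their rarity in the corpus.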