Paraphrase acquisition for information extraction

  • Authors:
  • Yusuke Shinyama;Satoshi Sekine

  • Affiliations:
  • New York University, Broadway, NY;New York University, Broadway, NY

  • Venue:
  • PARAPHRASE '03 Proceedings of the second international workshop on Paraphrasing - Volume 16
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

We are trying to find paraphrases from Japanese news articles which can be used for Information Extraction. We focused on the fact that a single event can be reported in more than one article in different ways. However, certain kinds of noun phrases such as names, dates and numbers behave as "anchors" which are unlikely to change across articles. Our key idea is to identify these anchors among comparable articles and extract portions of expressions which share the anchors. This way we can extract expressions which convey the same information. Obtained paraphrases are generalized as templates and stored for future use.In this paper, first we describe our basic idea of paraphrase acquisition. Our method is divided into roughly four steps, each of which is explained in turn. Then we illustrate several issues which we encounter in real texts. To solve these problems, we introduce two techniques: coreference resolution and structural restriction of possible portions of expressions. Finally we discuss the experimental results and conclusions.