A rote extractor with edit distance-based generalisation and multi-corpora precision calculation

  • Authors: Enrique Alfonseca; Pablo Castells; Manabu Okumura; Maria Ruiz-Casado

  • Affiliations: Univ. Autónoma de Madrid and Tokyo Institute of Technology; Univ. Autónoma de Madrid; Tokyo Institute of Technology; Univ. Autónoma de Madrid and Tokyo Institute of Technology

  • Venue: COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
  • Year: 2006

Abstract

In this paper, we describe a rote extractor that learns patterns for finding semantic relationships in unrestricted text, with new procedures for pattern generalization and scoring. These include the use of part-of-speech tags to guide the generalization, Named Entity categories inside the patterns, an edit-distance-based pattern generalization algorithm, and a pattern accuracy calculation procedure based on evaluating the patterns on several test corpora. In an evaluation with 14 entities, the system attains a precision higher than 50% for half of the relationships considered.
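
The abstract names two procedures without giving their details: edit-distance-based pattern generalization and precision calculation over several test corpora. The following is a minimal sketch of the general idea only, not the authors' algorithm. It assumes patterns are lists of tokens with hypothetical `<hook>` and `<target>` placeholders, aligns two patterns with plain Levenshtein distance, and replaces every differing position with a wildcard. It ignores the part-of-speech and Named Entity information the paper uses. The helper `pooled_precision` is likewise an assumption: it simply pools correct and total extractions over all corpora.

```python
from typing import List, Tuple

WILDCARD = "*"  # hypothetical wildcard symbol, not taken from the paper


def generalise(a: List[str], b: List[str]) -> List[str]:
    """Align two token patterns by Levenshtein distance and replace
    every position where they differ with a wildcard."""
    n, m = len(a), len(b)
    # dist[i][j] = edit distance between a[:i] and b[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # delete a token of a
                dist[i][j - 1] + 1,         # insert a token of b
                dist[i - 1][j - 1] + cost,  # match or substitute
            )

    # Walk back through the table to recover one optimal alignment.
    out: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and a[i - 1] == b[j - 1] and dist[i][j] == dist[i - 1][j - 1]:
            out.append(a[i - 1])  # tokens agree: keep the token
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            out.append(WILDCARD)  # substitution
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            out.append(WILDCARD)  # token present only in a
            i -= 1
        else:
            out.append(WILDCARD)  # token present only in b
            j -= 1
    out.reverse()

    # Collapse runs of wildcards into a single wildcard.
    merged: List[str] = []
    for tok in out:
        if tok == WILDCARD and merged and merged[-1] == WILDCARD:
            continue
        merged.append(tok)
    return merged


def pooled_precision(counts: List[Tuple[int, int]]) -> float:
    """Precision of one pattern pooled over several corpora.
    Each entry is (correct extractions, total extractions) for one corpus."""
    correct = sum(c for c, _ in counts)
    total = sum(t for _, t in counts)
    return correct / total if total else 0.0
```

For example, generalising `<hook> was born in <target>` with `<hook> was raised in <target>` under this sketch keeps the shared tokens and puts a wildcard where the verbs differ. A fuller treatment along the lines the abstract describes would use part-of-speech tags to decide when a mismatch may be generalised rather than always falling back to a wildcard.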