Building a Graph of Names and Contextual Patterns for Named Entity Classification

  • Authors:
  • César Pablo-Sánchez;Paloma Martínez

  • Affiliations:
  • Computer Science Department, Universidad Carlos III de Madrid, Leganés, Spain 28911;Computer Science Department, Universidad Carlos III de Madrid, Leganés, Spain 28911

  • Venue:
  • ECIR '09 Proceedings of the 31st European Conference on IR Research on Advances in Information Retrieval
  • Year:
  • 2009

Abstract

An algorithm is presented that bootstraps the acquisition of large dictionaries of entity types (names) and pattern types from a few seeds and a large unannotated corpus. The algorithm iteratively builds a bigraph of entities and collocated patterns by querying the text. Several classes simultaneously compete to label the entity types. Several experiments have been carried out to acquire resources from a 1GB corpus of Spanish news. The usefulness of the acquired list of entity types for the task of Name Classification has also been evaluated, with good results for a weakly supervised method.
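
The general idea of seed-based bootstrapping over a bipartite graph of names and contextual patterns can be illustrated with a toy sketch. This is not the authors' algorithm: the pattern definition (the single word to the left of a capitalized token), the vote-counting score, the strict-majority rule for competing classes, and the names `contexts`, `bootstrap`, `sentences`, and `seeds` are all assumptions made for illustration.

```python
from collections import defaultdict


def contexts(sentence):
    """Yield (name, pattern) pairs for one sentence.

    Assumption: a "name" is any capitalized token that is not sentence-initial,
    and its "pattern" is the lowercased word immediately to its left.
    """
    tokens = sentence.split()
    for i, tok in enumerate(tokens):
        if i > 0 and tok[0].isupper():
            yield tok, tokens[i - 1].lower() + " _"


def bootstrap(sentences, seeds, iterations=5):
    """Label names by propagating seed classes through shared patterns."""
    # Bipartite graph, stored as the name -> patterns adjacency.
    name_to_patterns = defaultdict(set)
    for sentence in sentences:
        for name, pattern in contexts(sentence):
            name_to_patterns[name].add(pattern)

    # Start from the seed names of every class.
    labels = {name: cls for cls, names in seeds.items() for name in names}

    for _ in range(iterations):
        # Each labeled name votes for its class on every pattern it touches.
        votes = defaultdict(lambda: defaultdict(int))
        for name, cls in labels.items():
            for pattern in name_to_patterns.get(name, ()):
                votes[pattern][cls] += 1

        # Classes compete for each unlabeled name; ties stay unlabeled.
        new_labels = {}
        for name, patterns in name_to_patterns.items():
            if name in labels:
                continue
            scores = defaultdict(int)
            for pattern in patterns:
                for cls, count in votes.get(pattern, {}).items():
                    scores[cls] += count
            ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
            if ranked and (len(ranked) == 1 or ranked[0][1] > ranked[1][1]):
                new_labels[name] = ranked[0][0]

        if not new_labels:
            break  # nothing new was learned, so stop early
        labels.update(new_labels)
    return labels


# Hypothetical toy corpus and seeds, for illustration only.
sentences = [
    "the president Obama visited the city Madrid",
    "the president Zapatero left the city Paris",
    "the president Sarkozy spoke",
    "near the city Leganés",
    "the president Washington and the city Washington",
]
seeds = {"PERSON": ["Obama"], "LOCATION": ["Madrid"]}
labels = bootstrap(sentences, seeds)
```

In this sketch a name that appears equally often with patterns of two classes (here "Washington") is left unlabeled, which is one simple way to model classes competing for an entity; a real system would need richer patterns and a more careful scoring function.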