Mining name translations from comparable corpora by creating bilingual information networks

  • Authors: Heng Ji
  • Affiliations: The City University of New York, New York, NY
  • Venue: BUCC '09 Proceedings of the 2nd Workshop on Building and Using Comparable Corpora: from Parallel to Non-parallel Corpora
  • Year: 2009

Abstract

This paper describes a new task: extracting and aligning information networks from comparable corpora. As a case study, we demonstrate the effectiveness of this task on automatically mining name translation pairs. Starting from a small set of seed pairs, we design a novel bootstrapping approach to acquire name translation pairs. The experimental results show that this approach can generate highly accurate name translation pairs for person, geopolitical, and organization entities.
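
To make the bootstrapping idea concrete, the following is a minimal illustrative sketch and not the paper's actual algorithm, which the abstract does not specify. It assumes each language's information network is a dictionary mapping an entity name to a set of (relation, neighbor) links, and it aligns a new name pair whenever the two names share at least `min_support` links through already aligned neighbors. The function name `bootstrap_align`, the parameter `min_support`, the relation labels, and all entity names are hypothetical.

```python
def bootstrap_align(src_net, tgt_net, seeds, min_support=1):
    """Grow a set of name translation pairs from seed pairs (illustrative only).

    src_net / tgt_net: dict mapping a name to a set of (relation, neighbor) links.
    seeds: dict mapping a source name to its known target translation.
    """
    aligned = dict(seeds)
    changed = True
    while changed:  # repeat until a full pass adds no new pair
        changed = False
        for s, s_links in src_net.items():
            if s in aligned:
                continue
            # Project the source links into the target language
            # through neighbors whose translations are already known.
            projected = {(rel, aligned[n]) for rel, n in s_links if n in aligned}
            used = set(aligned.values())
            best, best_score = None, 0
            for t, t_links in tgt_net.items():
                if t in used:
                    continue
                score = len(projected & t_links)  # number of shared links
                if score > best_score:
                    best, best_score = t, score
            if best is not None and best_score >= min_support:
                aligned[s] = best
                changed = True
    return aligned


# Hypothetical toy networks; every name and relation here is invented.
src_net = {
    "Springfield": set(),
    "Acme Corp": {("located_in", "Springfield")},
    "Alice Smith": {("leader_of", "Acme Corp")},
    "Bob Jones": {("member_of", "Other Org")},
}
tgt_net = {
    "tgt:CityX": set(),
    "tgt:OrgY": {("located_in", "tgt:CityX")},
    "tgt:PersonZ": {("leader_of", "tgt:OrgY")},
    "tgt:PersonW": {("leader_of", "tgt:OrgQ")},
}
seeds = {"Springfield": "tgt:CityX"}

pairs = bootstrap_align(src_net, tgt_net, seeds)
print(pairs)
```

In this toy run, the single seed pair lets the organization be aligned first, and that new pair in turn lets the person be aligned. "Bob Jones" stays unaligned because none of its neighbors has a known translation.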