Improved Named Entity Translation and Bilingual Named Entity Extraction

  • Authors:
  • Fei Huang

  • Affiliations:
  • -

  • Venue:
  • ICMI '02 Proceedings of the 4th IEEE International Conference on Multimodal Interfaces
  • Year:
  • 2002

Abstract

Translation of named entities (NE), including proper names and temporal and numerical expressions, is very important in multilingual natural language processing tasks such as cross-lingual information retrieval and statistical machine translation. In this paper we present an integrated approach that extracts a named entity translation dictionary from a bilingual corpus while at the same time improving the named entity annotation quality. Starting from a bilingual corpus where the named entities are extracted independently for each language, a statistical alignment model is used to align the named entities. An iterative process is applied to extract named entity pairs with higher alignment probability. This leads to a smaller but cleaner named entity translation dictionary and also to a significant improvement in the monolingual named entity annotation quality for both languages. Experimental results show that the dictionary size is reduced by 51.8% and that the annotation quality, measured by F-score, improves from 70.03 to 78.15 for Chinese and from 73.38 to 81.46 for the other language.
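The iterative extraction step described in the abstract can be illustrated with a small sketch. This is not the paper's alignment model: it assumes a much simpler relative-frequency score based on co-occurrence, and the function names, the `threshold` parameter, and the toy data are all hypothetical. It only shows the general idea of re-estimating pair probabilities and discarding low-probability pairs until the dictionary stops changing.

```python
from collections import Counter


def estimate_probs(sentence_pairs, active):
    """Relative-frequency estimate of P(target NE | source NE).

    Assumption: each source NE spreads one unit of probability mass
    evenly over the target NEs it may still align to in that sentence.
    This is a stand-in for a real statistical alignment model.
    """
    pair_counts = Counter()
    src_counts = Counter()
    for src_nes, tgt_nes in sentence_pairs:
        for s in src_nes:
            candidates = [t for t in tgt_nes if (s, t) in active]
            if not candidates:
                continue
            src_counts[s] += 1
            for t in candidates:
                pair_counts[(s, t)] += 1.0 / len(candidates)
    return {pair: c / src_counts[pair[0]] for pair, c in pair_counts.items()}


def extract_ne_pairs(sentence_pairs, threshold=0.5, max_iters=10):
    """Iteratively prune NE pairs whose estimated probability is low."""
    # Start with every source/target NE pair that co-occurs in a sentence pair.
    active = {(s, t) for src, tgt in sentence_pairs for s in src for t in tgt}
    for _ in range(max_iters):
        probs = estimate_probs(sentence_pairs, active)
        kept = {pair for pair, p in probs.items() if p >= threshold}
        if kept == active:
            # Converged: no pair was removed in this round.
            return {pair: probs[pair] for pair in active}
        active = kept
    # Iteration limit reached: re-estimate once on the final pair set.
    probs = estimate_probs(sentence_pairs, active)
    return {pair: probs[pair] for pair in active if pair in probs}


# Toy corpus: NE lists per sentence pair (hypothetical placeholder tokens).
toy_corpus = [
    (["beijing_zh"], ["Beijing"]),
    (["beijing_zh", "shanghai_zh"], ["Beijing", "Shanghai"]),
    (["shanghai_zh"], ["Shanghai"]),
]
dictionary = extract_ne_pairs(toy_corpus)
print(dictionary)
```

On this toy corpus the first round scores the two correct pairs at 0.75 and the two spurious cross pairs at 0.25; the spurious pairs are dropped, and the second round rescores the remaining pairs at 1.0. The dictionary shrinks from four candidate pairs to two, mirroring in miniature the "smaller but cleaner" outcome the abstract reports.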