Learning phonetic similarity for matching named entity translations and mining new translations

  • Authors: Wai Lam; Ruizhang Huang; Pik-Shan Cheung

  • Affiliation: The Chinese University of Hong Kong, Shatin, Hong Kong (all three authors)

  • Venue: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
  • Year: 2004

Abstract

We propose a novel named entity matching model which considers both semantic and phonetic clues. The matching is formulated as an optimization problem. One major component is a phonetic matching model which exploits similarity at the phoneme level. We investigate three learning algorithms for obtaining the similarity information of basic phoneme units based on training examples. By applying this proposed named entity matching model, we also develop a mining framework for discovering new, unseen named entity translations from online daily Web news. This framework harvests comparable news in different languages using an existing bilingual dictionary. It is able to discover new name translations not found in the dictionary.
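To make the idea of phoneme-level matching concrete, the following is a minimal sketch and not the authors' formulation: the abstract does not give the optimization objective or the three learning algorithms. It assumes each name has already been converted to a list of phoneme symbols, and it scores two names by the best order-preserving alignment of their phonemes under a pairwise similarity table. The table `SIM`, its values, and the function names `phoneme_sim` and `phonetic_score` are hypothetical; in the paper's setting, such similarity values would be learned from training examples rather than set by hand.

```python
from typing import Dict, List, Tuple

# Hypothetical phoneme-pair similarities (illustrative values only;
# the abstract states that such values are learned from training examples).
SIM: Dict[Tuple[str, str], float] = {
    ("b", "p"): 0.8,
    ("d", "t"): 0.8,
    ("l", "r"): 0.6,
}


def phoneme_sim(a: str, b: str) -> float:
    """Similarity of two phoneme symbols: 1.0 if identical, else table lookup."""
    if a == b:
        return 1.0
    # The table is treated as symmetric; unknown pairs score 0.0.
    return SIM.get((a, b), SIM.get((b, a), 0.0))


def phonetic_score(x: List[str], y: List[str]) -> float:
    """Best order-preserving alignment score, normalized by the longer length.

    Dynamic programming over prefixes: best[i][j] is the highest total
    similarity achievable when aligning x[:i] with y[:j]. Skipping a
    phoneme on either side contributes nothing to the total.
    """
    if not x or not y:
        return 0.0
    n, m = len(x), len(y)
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j],      # skip x[i-1]
                best[i][j - 1],      # skip y[j-1]
                best[i - 1][j - 1] + phoneme_sim(x[i - 1], y[j - 1]),
            )
    return best[n][m] / max(n, m)
```

Under these assumptions, `phonetic_score(["b", "a", "t"], ["p", "a", "d"])` aligns the three positions pairwise, and two identical phoneme sequences score 1.0. A full matcher in the spirit of the abstract would combine such a phonetic score with semantic clues, which this sketch omits.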