Named entity translation matching and learning: With application for mining unseen translations

  • Authors: Wai Lam; Shing-Kit Chan; Ruizhang Huang

  • Affiliations: The Chinese University of Hong Kong, Shatin, Hong Kong (all authors)

  • Venue: ACM Transactions on Information Systems (TOIS)

  • Year: 2007

Abstract

This article introduces a named entity matching model that uses both semantic and phonetic evidence. A bipartite graph model provides a unified framework for matching the semantic and phonetic information. Because the model accounts for technical challenges of the problem, including order insensitivity and partial matching, it is less rigid than existing approaches and highly robust. One major component is a phonetic matching model that exploits similarity at the phoneme level. Two algorithms are investigated for learning, from training examples, the similarity information of basic phonemic matching units. The proposed named entity matching model is then applied to build a mining system that discovers new named entity translations from daily Web news. The system is able to discover new name translations that cannot be found in the existing bilingual dictionary.
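
To make the bipartite-graph idea concrete, here is a minimal Python sketch. It is not the authors' model. It assumes each name is split into tokens, that every source/target token pair gets a similarity weight in [0, 1], and that the match score is the best one-to-one assignment divided by the length of the longer name. The names `TOY_DICTIONARY`, `toy_sim`, and `match_score` are hypothetical. The string-overlap fallback only stands in for a phonetic similarity, which the article learns at the phoneme level.

```python
from difflib import SequenceMatcher
from itertools import permutations

# Hypothetical semantic evidence: a tiny bilingual lexicon (English, romanized Chinese).
TOY_DICTIONARY = {("bank", "yinhang"), ("china", "zhongguo")}


def toy_sim(src_token, tgt_token):
    """Assumed edge weight: 1.0 for a dictionary hit, else a crude string overlap.

    The overlap ratio is only a placeholder for a real phonetic similarity.
    """
    s, t = src_token.lower(), tgt_token.lower()
    if (s, t) in TOY_DICTIONARY:
        return 1.0
    return SequenceMatcher(None, s, t).ratio()


def match_score(src_tokens, tgt_tokens, sim=toy_sim):
    """Score two names by their best one-to-one token assignment.

    Trying every assignment makes the score independent of token order.
    Dividing by the longer length lets partially matching names receive
    a partial score instead of being rejected outright.
    """
    if not src_tokens or not tgt_tokens:
        return 0.0
    weights = [[sim(s, t) for t in tgt_tokens] for s in src_tokens]
    if len(src_tokens) > len(tgt_tokens):
        # Transpose so that rows always index the shorter name.
        weights = [list(col) for col in zip(*weights)]
    n_rows, n_cols = len(weights), len(weights[0])
    # Brute force over assignments; fine for names with a handful of tokens.
    best = max(
        sum(weights[i][j] for i, j in enumerate(perm))
        for perm in permutations(range(n_cols), n_rows)
    )
    return best / max(n_rows, n_cols)


if __name__ == "__main__":
    # Same tokens in a different order: the assignment recovers the pairing.
    print(match_score(["china", "bank"], ["yinhang", "zhongguo"]))
    # "of" has no counterpart, so the name matches only partially.
    print(match_score(["bank", "of", "china"], ["zhongguo", "yinhang"]))
```

Exhaustive search keeps the sketch dependency-free. For longer names, a polynomial-time assignment solver such as the Hungarian algorithm would be the usual substitute.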