Mining Translations of Chinese Names from Web Corpora Using a Query Expansion Technique and Support Vector Machine

  • Authors:
  • Kai-Hsiang Yang;Wei-Da Chen;Hahn-Ming Lee;Jan-Ming Ho

  • Venue:
  • WI-IATW '07 Proceedings of the 2007 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology - Workshops
  • Year:
  • 2007

Abstract

Chinese name translation is a special case of the problem of named entity translation. It is a very challenging problem because many Romanization systems exist and some people add extra words to their English names. Translating a scholar's name to its corresponding English name could help find information about the scholar's academic achievements. In this paper, we provide a classification for Chinese names and propose a novel approach to mining Chinese name translations from Web corpora. Our approach uses a query expansion technique and a Support Vector Machine (SVM) to extract name translation candidates based on three kinds of features: the phonetic similarity, the smallest distance, and the number of appearances in the neighborhood. Experimental results show that our approach can correctly translate the majority of Chinese names.
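
The three features named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define the features precisely, so the function names, the token-based distance, the window size, and the use of plain string similarity as a stand-in for phonetic similarity are all assumptions made for illustration. The example name and snippets are hypothetical.

```python
from difflib import SequenceMatcher


def phonetic_similarity(romanized: str, candidate: str) -> float:
    """Assumed stand-in: string similarity between a romanization and a candidate."""
    a = romanized.lower().replace(" ", "").replace("-", "")
    b = candidate.lower().replace(" ", "").replace("-", "")
    return SequenceMatcher(None, a, b).ratio()


def find_positions(tokens, phrase_tokens):
    """Return every token index at which phrase_tokens starts in tokens."""
    n = len(phrase_tokens)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase_tokens]


def extract_features(snippets, chinese_name, romanized, candidate, window=10):
    """Build a (similarity, smallest distance, neighborhood count) feature tuple.

    Distances are measured in whitespace-separated tokens (an assumption).
    """
    cand_tokens = candidate.lower().split()
    smallest = float("inf")   # stays infinite if the pair never co-occurs
    appearances = 0           # candidate occurrences within `window` tokens of the name
    for snippet in snippets:
        tokens = snippet.lower().split()
        name_pos = [i for i, t in enumerate(tokens) if chinese_name in t]
        for c in find_positions(tokens, cand_tokens):
            distances = [abs(c - p) for p in name_pos]
            if distances:
                smallest = min(smallest, min(distances))
                if min(distances) <= window:
                    appearances += 1
    return (phonetic_similarity(romanized, candidate), smallest, appearances)


# Hypothetical search-result snippets for a made-up name.
snippets = [
    "王小明 Xiao-Ming Wang is a professor",
    "Contact Xiao-Ming Wang ( 王小明 ) by email",
]
features = extract_features(snippets, "王小明", "wang xiao ming", "Xiao-Ming Wang")
```

In a pipeline like the one the abstract describes, feature tuples of this kind would be computed for each candidate found in the retrieved snippets and passed to a trained SVM classifier (for example, scikit-learn's `SVC`) that decides whether a candidate is a correct translation.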