Unsupervised translation disambiguation based on maximum web bilingual relatedness: web as lexicon

  • Authors:
  • PengYuan Liu;TieJun Zhao

  • Affiliations:
  • Institute of Computational Linguistics, Peking University, Beijing, China;Department of Computer Science and Engineering, Harbin Institute of Technology, Harbin, China

  • Venue:
  • FSKD'09 Proceedings of the 6th international conference on Fuzzy systems and knowledge discovery - Volume 7
  • Year:
  • 2009

Abstract

This paper treats the Web as a semantic lexicon in order to alleviate the problem of acquiring bilingual lexical knowledge. Four Web Bilingual Relatedness (WBR) measures are built from mixed-language web page counts. The WBR measures are evaluated on a modified version of the Miller-Charles dataset, and the measure based on point-wise mutual information is found to perform best. The paper further presents a fully unsupervised translation disambiguation method that selects the translation maximizing the sum of WBR between the translation and all context words. When this method is tested on the Multilingual Chinese-English Lexical Sample Task of SemEval-2007, the WBR disambiguation model based on point-wise mutual information achieves the best performance, outperforms previous work, and obtains state-of-the-art results (Pmar=0.451).
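
The selection rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact WBR formulas, so the standard page-count form of point-wise mutual information is assumed here, along with a fallback of 0.0 when any count is zero. The function names, the total page count, and all counts in the toy data are hypothetical and invented purely for illustration.

```python
import math
from typing import Callable, Sequence


def pmi_relatedness(count_x: int, count_y: int, count_xy: int, total_pages: int) -> float:
    """Assumed PMI form: log2(P(x, y) / (P(x) * P(y))), with P = count / total_pages."""
    # Assumption: return 0.0 when any count is zero, to avoid log(0) or division by zero.
    if count_x <= 0 or count_y <= 0 or count_xy <= 0:
        return 0.0
    return math.log2((count_xy * total_pages) / (count_x * count_y))


def choose_translation(
    candidates: Sequence[str],
    context_words: Sequence[str],
    relatedness: Callable[[str, str], float],
) -> str:
    """Pick the candidate whose summed relatedness to all context words is largest."""
    return max(candidates, key=lambda t: sum(relatedness(t, w) for w in context_words))


# Toy data: every number below is made up, not taken from the paper or a search engine.
TOTAL_PAGES = 1_000_000
SINGLE = {"bank": 1000, "shore": 1000, "钱": 2000, "存款": 500}
PAIR = {("bank", "钱"): 50, ("bank", "存款"): 20, ("shore", "钱"): 2}


def toy_relatedness(translation: str, context_word: str) -> float:
    """Look up the invented page counts and score one translation/context-word pair."""
    return pmi_relatedness(
        SINGLE[translation],
        SINGLE[context_word],
        PAIR.get((translation, context_word), 0),
        TOTAL_PAGES,
    )


best = choose_translation(["shore", "bank"], ["钱", "存款"], toy_relatedness)
print(best)
```

In a real setting the counts would come from search-engine queries that pair an English candidate translation with a Chinese context word. Passing the relatedness function as a parameter means any of the four WBR measures could be swapped in without changing the selection step.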