Bilingual word spectral clustering for statistical machine translation

  • Authors:
  • Bing Zhao;Eric P. Xing;Alex Waibel

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh, Pennsylvania;Carnegie Mellon University, Pittsburgh, Pennsylvania;Carnegie Mellon University, Pittsburgh, Pennsylvania

  • Venue:
  • ParaText '05 Proceedings of the ACL Workshop on Building and Using Parallel Texts
  • Year:
  • 2005

Abstract

In this paper, a variant of a spectral clustering algorithm is proposed for bilingual word clustering. The proposed algorithm efficiently generates two sets of clusters, one for each language, with high semantic correlation within the monolingual clusters and high translation quality across clusters between the two languages. Each cluster-level translation is treated as a bilingual concept that generalizes over the words in the bilingual clusters. This scheme improves the robustness of statistical machine translation models. Two HMM-based translation models are tested with these bilingual clusters. Improved perplexity, word alignment accuracy, and translation quality are observed in our experiments.
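
To make the idea of clustering both vocabularies at once more concrete, here is a minimal sketch of generic spectral bipartitioning applied to a source-target word co-occurrence matrix. It is not the variant proposed in the paper, whose details the abstract does not give. The function name `bilingual_bipartition`, the toy English-French counts, the restriction to two clusters, and the use of power iteration are all assumptions made for illustration.

```python
import math


def bilingual_bipartition(counts, iters=200):
    """Split source and target words into two aligned clusters.

    counts[i][j] is a co-occurrence (or alignment) count between source
    word i and target word j. Assumes every row and column has a nonzero
    sum and that the matrix is not rank one.
    """
    m, n = len(counts), len(counts[0])
    row = [sum(r) for r in counts]
    col = [sum(counts[i][j] for i in range(m)) for j in range(n)]

    # Degree-normalised matrix An = D1^{-1/2} A D2^{-1/2}.
    an = [[counts[i][j] / math.sqrt(row[i] * col[j]) for j in range(n)]
          for i in range(m)]

    # The leading right singular vector of An is proportional to
    # sqrt(column sums); it carries no cluster information.
    total = math.sqrt(sum(col))
    v1 = [math.sqrt(c) / total for c in col]

    def times(v):      # An v
        return [sum(an[i][j] * v[j] for j in range(n)) for i in range(m)]

    def times_t(u):    # An^T u
        return [sum(an[i][j] * u[i] for i in range(m)) for j in range(n)]

    # Power iteration on An^T An, projecting out v1 at every step, so the
    # iterate converges to the second right singular vector.
    v = [float(j + 1) for j in range(n)]
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(v, v1))
        v = [a - dot * b for a, b in zip(v, v1)]
        v = times_t(times(v))
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
    u = times(v)  # matching left singular vector, up to scale

    # Undo the degree scaling, then split by sign. The first source word
    # is always placed in cluster 0 so that labels are reproducible.
    src = [u[i] / math.sqrt(row[i]) for i in range(m)]
    tgt = [v[j] / math.sqrt(col[j]) for j in range(n)]
    ref = src[0]
    src_labels = [0 if s * ref > 0 else 1 for s in src]
    tgt_labels = [0 if t * ref > 0 else 1 for t in tgt]
    return src_labels, tgt_labels


# Hypothetical toy counts: rows are English words, columns French words.
english = ["dog", "cat", "car", "bus"]
french = ["chien", "chat", "voiture", "autobus"]
counts = [
    [9, 3, 1, 0],
    [3, 9, 0, 1],
    [1, 0, 9, 3],
    [0, 1, 3, 9],
]
src_labels, tgt_labels = bilingual_bipartition(counts)
print(dict(zip(english, src_labels)))
print(dict(zip(french, tgt_labels)))
```

On this toy matrix the animal words and the vehicle words fall into separate clusters in both languages, and the shared cluster index links each English cluster to its French counterpart. A realistic system would need more than two clusters and a sparse eigen-solver rather than dense power iteration.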