Bilingual knowledge acquisition from Korean-English parallel corpus using alignment method: Korean-English alignment at word and phrase level

  • Authors:
  • Jung H. Shin;Young S. Han;Key-Sun Choi

  • Affiliations:
  • Korean Advanced Institute of Science and Technology, Taejon, Korea;Suwon University, Kyungki, Korea;Korean Advanced Institute of Science and Technology, Taejon, Korea

  • Venue:
  • COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 1
  • Year:
  • 1996

Abstract

This paper proposes a method for aligning a Korean-English parallel corpus. The structural dissimilarity between Korean and Indo-European languages requires more flexible measures for evaluating alignment candidates between bilingual units than those used for pairs of Indo-European languages. The flexible measure is intended to capture the dependency between bilingual items that can occur in different units according to different ordering rules. The proposed method for Korean-English alignment takes phrases as the alignment unit, a departure from existing methods that take words as the unit. Phrasal alignment sidesteps the alignment-unit problem and eases the problem of ordering mismatch. The parameters are estimated using the EM algorithm. The proposed alignment algorithm is based on dynamic programming. In experiments carried out on 253,000 English words and their Korean translations, the proposed method achieved 68.7% accuracy at the phrase level and 89.2% accuracy for the bilingual dictionary induced from the alignment. The result of the alignment may lead to richer bilingual data than can be derived from word-level alignments alone.
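
The abstract does not spell out the model, so the following is only a minimal sketch of how EM can estimate translation parameters from a parallel corpus. It uses the standard IBM Model 1 style of EM over single words, which is a swapped-in illustration and not the phrase-level model of the paper. The function name `em_translation_probs`, the romanized Korean tokens, and the three toy sentence pairs are all hypothetical.

```python
from collections import defaultdict


def em_translation_probs(pairs, iterations=20):
    """Estimate t(english | korean) with IBM Model 1-style EM.

    `pairs` is a list of (korean_tokens, english_tokens) tuples.
    This is an illustrative word-level model, not the paper's method.
    """
    # Uniform start over every word pair that co-occurs in some sentence pair.
    t = {}
    for ko, en in pairs:
        for e in en:
            for k in ko:
                t[(e, k)] = 1.0

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (english, korean) links
        total = defaultdict(float)  # expected count of links per Korean word

        # E-step: distribute each English word over the Korean words
        # of the same sentence pair, in proportion to the current t.
        for ko, en in pairs:
            for e in en:
                norm = sum(t[(e, k)] for k in ko)
                for k in ko:
                    frac = t[(e, k)] / norm
                    count[(e, k)] += frac
                    total[k] += frac

        # M-step: renormalize the expected counts per Korean word.
        for (e, k), c in count.items():
            t[(e, k)] = c / total[k]

    return t


# Hypothetical toy corpus (romanized Korean, assumed glosses).
toy_pairs = [
    (["jip"], ["house"]),
    (["keun", "jip"], ["big", "house"]),
    (["keun", "chaek"], ["big", "book"]),
]

probs = em_translation_probs(toy_pairs)


def best_translation(korean_word):
    """Return the English word with the highest t(english | korean_word)."""
    candidates = {e: p for (e, k), p in probs.items() if k == korean_word}
    return max(candidates, key=candidates.get)
```

On this toy corpus, `best_translation("jip")` picks `"house"`, because the one-word sentence pair anchors that link and EM then shifts the remaining probability mass accordingly. A dictionary induced this way is the word-level analogue of the bilingual dictionary the abstract mentions. The phrase-level units and the dynamic-programming search are not shown here.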