Collocation Dictionary Optimization Using WordNet and k-Nearest Neighbor Learning

  • Authors:
  • Yuseop Kim;Byoung-Tak Zhang;Yung Taek Kim

  • Affiliations:
  • Department of Computer Engineering, Seoul National University, Seoul 151-742, Korea E-mail: yskim@nova.snu.ac.kr;Department of Computer Engineering, Seoul National University, Seoul 151-742, Korea E-mail: btzhang@comp.snu.ac.kr;Department of Computer Engineering, Seoul National University, Seoul 151-742, Korea E-mail: ytkim@comp.snu.ac.kr

  • Venue:
  • Machine Translation
  • Year:
  • 2001

Abstract

In machine translation, collocation dictionaries are important for selecting accurate target words. However, if the dictionary is too large, it can decrease the efficiency of translation. This paper presents a method to develop a compact collocation dictionary for transitive verb–object pairs in English–Korean machine translation without losing translation accuracy. We use WordNet to calculate the semantic distance between words, and k-nearest neighbor learning to select the translations. The entries in the dictionary are minimized to balance the trade-off between translation accuracy and time. We have performed several experiments on a selected set of verbs extracted from a raw corpus of over 3 million words. The results show that in real-time translation environments the size of a collocation dictionary can be reduced up to 40% of its original size without a significant decrease in its accuracy.
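
The general idea of choosing a verb translation by k-nearest neighbor voting over semantic distances can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the tiny hand-written hypernym table stands in for WordNet, the distance is a simple edge count through the lowest common ancestor, and the labels `translation_A` and `translation_B` are placeholders for two target-language renderings of the same English verb.

```python
from collections import Counter

# Toy hypernym links (child -> parent). A hypothetical stand-in for WordNet,
# not real WordNet data.
HYPERNYM = {
    "shirt": "garment",
    "coat": "garment",
    "garment": "artifact",
    "hat": "headdress",
    "cap": "headdress",
    "helmet": "headdress",
    "headdress": "artifact",
    "artifact": "entity",
}


def path_to_root(word):
    """Return the chain of nodes from `word` up to the root of the toy taxonomy."""
    path = [word]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path


def semantic_distance(a, b):
    """Count edges between `a` and `b` through their lowest common ancestor.

    This edge-count measure is an assumed, simplified distance; the paper's
    actual measure is not specified in the abstract.
    """
    path_a = path_to_root(a)
    depth_in_b = {node: depth for depth, node in enumerate(path_to_root(b))}
    for depth_a, node in enumerate(path_a):
        if node in depth_in_b:
            return depth_a + depth_in_b[node]
    raise ValueError(f"no common ancestor for {a!r} and {b!r}")


def knn_translation(obj, entries, k=3):
    """Pick a translation label by majority vote among the k nearest entries.

    `entries` is a list of (object_noun, translation_label) pairs for one verb.
    """
    nearest = sorted(entries, key=lambda e: semantic_distance(obj, e[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical dictionary entries for a single transitive verb.
ENTRIES = [
    ("shirt", "translation_A"),
    ("coat", "translation_A"),
    ("hat", "translation_B"),
    ("cap", "translation_B"),
]

if __name__ == "__main__":
    # "helmet" has no entry of its own; its neighbors decide the translation.
    print(knn_translation("helmet", ENTRIES, k=3))
```

In this sketch an object noun with no entry of its own, such as `helmet`, still receives a translation from its semantically closest neighbors, which suggests why some entries could be dropped from a dictionary without changing the selected translation.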