Building an MT dictionary from parallel texts based on linguistic and statistical information

  • Authors: Akira Kumano; Hideki Hirakawa
  • Affiliations: R & D Center, Toshiba Corporation, Kawasaki, Japan (both authors)
  • Venue: COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 1
  • Year: 1994

Abstract

This paper describes a method for generating a machine translation (MT) dictionary from parallel texts. The method uses both statistical and linguistic information to find corresponding words or phrases in the parallel texts. Combining the two types of information makes it possible to extract translation pairs that a linguistic-based method alone cannot obtain. On small Japanese/English parallel texts containing severe distortions, the first candidate is an accurate translation for over 70% of compound nouns and over 50% of unknown words.
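
The abstract does not give the scoring details, so the following is only a minimal sketch of how a statistical score and a linguistic score might be combined to rank translation candidates. Everything in it is an assumption made for illustration and not the authors' formulation: the Dice coefficient as the statistical measure, a seed-dictionary overlap as the linguistic measure, the linear `weight`, and the names `dice_scores`, `linguistic_score`, and `rank_candidates`.

```python
from collections import Counter
from itertools import product


def dice_scores(aligned_units):
    """Statistical score (assumed: Dice coefficient) for every term pair
    that co-occurs in corresponding text units."""
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src_terms, tgt_terms in aligned_units:
        src_terms, tgt_terms = set(src_terms), set(tgt_terms)
        src_count.update(src_terms)
        tgt_count.update(tgt_terms)
        pair_count.update(product(src_terms, tgt_terms))
    return {
        (s, t): 2 * n / (src_count[s] + tgt_count[t])
        for (s, t), n in pair_count.items()
    }


def linguistic_score(src_parts, tgt_term, seed_dict):
    """Linguistic score (assumed): fraction of source constituents whose
    seed-dictionary translation appears among the target term's words."""
    if not src_parts:
        return 0.0  # unknown word: no linguistic evidence available
    tgt_words = set(tgt_term.split())
    hits = sum(1 for part in src_parts if seed_dict.get(part, set()) & tgt_words)
    return hits / len(src_parts)


def rank_candidates(src_term, src_parts, aligned_units, seed_dict, weight=0.5):
    """Rank target candidates for src_term by a weighted sum of both scores."""
    scored = [
        (weight * stat + (1 - weight) * linguistic_score(src_parts, t, seed_dict), t)
        for (s, t), stat in dice_scores(aligned_units).items()
        if s == src_term
    ]
    # Highest combined score first; ties broken alphabetically for determinism.
    return [t for _, t in sorted(scored, key=lambda item: (-item[0], item[1]))]


# Hypothetical toy data: terms extracted from three corresponding text units.
units = [
    ({"機械翻訳", "辞書"}, {"machine translation", "dictionary"}),
    ({"機械翻訳"}, {"machine translation", "system"}),
    ({"辞書"}, {"dictionary"}),
]
seed = {"機械": {"machine"}, "翻訳": {"translation"}}

compound_ranking = rank_candidates("機械翻訳", ["機械", "翻訳"], units, seed)
unknown_ranking = rank_candidates("辞書", [], units, seed)
```

In this toy setup the compound noun benefits from both scores, while the word with no constituents in the seed dictionary is ranked on co-occurrence alone, which mirrors the abstract's point that combining the two sources reaches pairs a linguistic method alone would miss.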