Statistical machine translation enhancements through linguistic levels: A survey

  • Authors: Marta R. Costa-Jussà; Mireia Farrús
  • Affiliations: Institute for Infocomm Research, Singapore; Universitat Pompeu Fabra, Barcelona
  • Venue: ACM Computing Surveys (CSUR)
  • Year: 2014

Abstract

Machine translation can be considered a highly interdisciplinary and multidisciplinary field because it is approached from the point of view of human translators, engineers, computer scientists, mathematicians, and linguists. One of the most popular approaches is Statistical Machine Translation (SMT), which tries to cover translation in a holistic manner by learning from parallel corpora aligned at the sentence level. However, with this basic approach, some issues remain unsolved at each written linguistic level (i.e., orthographic, morphological, lexical, syntactic, and semantic). Research in SMT has continuously focused on solving the challenges at these different linguistic levels. This article presents a survey of how SMT has been enhanced to perform translation correctly at all linguistic levels.