An intelligent terminology database as a pre-processor for statistical machine translation

  • Authors: Michael Carl; Philippe Langlais

  • Affiliations: Université de Montréal, Montréal, Québec, Canada; Université de Montréal, Montréal, Québec, Canada

  • Venue: COMPUTERM '02 COLING-02 on COMPUTERM 2002: second international workshop on computational terminology - Volume 14
  • Year: 2002

Abstract

In a recent study, Langlais (2002) showed that the output of a Statistical Machine Translation (SMT) system deteriorates significantly as the text to be translated diverges from the text the system was trained on. The same study showed that bilingual terminological databases can be used to boost the performance of the statistical engine. This paper extends the notion of a terminological database to an Intelligent Terminological Database (ITDB) capable of detecting and reducing terms and their variants and of regenerating the authorized target-language terms. The paper discusses the aims and the architecture of the ITDB and evaluates its integration with an SMT system.
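
The detect, reduce, and regenerate steps named in the abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the architecture described in the paper: the dictionary `TERM_DB`, the functions `build_variant_index`, `reduce_terms`, and `regenerate_terms`, the `__TERMn__` placeholder format, and the sample entry are all hypothetical. The SMT engine itself is not modelled, so the "translated" sentence in the usage note is written by hand.

```python
import re

# Hypothetical toy entry: canonical source term -> known variants and the
# authorized target-language term. A real ITDB would be far richer.
TERM_DB = {
    "hard disk": {
        "variants": ["hard disk", "hard disks", "hard-disk", "hard drive"],
        "target": "disque dur",
    },
}


def build_variant_index(term_db):
    """Map each lower-cased variant to its canonical source term."""
    index = {}
    for canonical, entry in term_db.items():
        for variant in entry["variants"]:
            index[variant.lower()] = canonical
    return index


def reduce_terms(sentence, term_db):
    """Detect variants and replace each with a placeholder.

    Returns the reduced sentence and a table mapping every placeholder
    to the canonical term it stands for.
    """
    index = build_variant_index(term_db)
    # Try longer variants first so multi-word forms are preferred.
    alternatives = sorted(index, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(v) for v in alternatives) + r")\b",
        re.IGNORECASE,
    )
    slots = {}

    def substitute(match):
        placeholder = f"__TERM{len(slots)}__"
        slots[placeholder] = index[match.group(1).lower()]
        return placeholder

    return pattern.sub(substitute, sentence), slots


def regenerate_terms(translated, slots, term_db):
    """Replace each placeholder with the authorized target term."""
    for placeholder, canonical in slots.items():
        translated = translated.replace(placeholder, term_db[canonical]["target"])
    return translated
```

In this sketch, `reduce_terms("Replace the Hard Drive.", TERM_DB)` yields a sentence containing `__TERM0__` in place of the variant, and `regenerate_terms` applied to a hand-written translation such as `"Remplacez le __TERM0__."` restores the authorized term. The sketch ignores inflection and agreement on the target side, which a real system would have to handle.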