Translation spotting for translation memories

Authors:
Michel Simard
Affiliations:
Université de Montréal, Montréal (Québec), Canada
Venue:
HLT-NAACL-PARALLEL '03 Proceedings of the HLT-NAACL 2003 Workshop on Building and using parallel texts: data driven machine translation and beyond - Volume 3
Year:
2003

Citing 7
Cited 5

Managing gigabytes (2nd ed.): compressing and indexing documents and images

Managing gigabytes (2nd ed.): compressing and indexing documents and images
Bricks and Skeletons: Some Ideas for the Near Future of MAHT

Machine Translation
The mathematics of statistical machine translation: parameter estimation

Computational Linguistics - Special issue on using large corpora: II
Stochastic inversion transduction grammars and bilingual parsing of parallel corpora

Computational Linguistics
Example-Based Machine Translation in the Pangloss system

COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 1
Towards a unified approach to memory- and statistical-based machine translation

ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
Shallow parsing as part-of-speech tagging

ConLL '00 Proceedings of the 2nd workshop on Learning language in logic and the 4th conference on Computational natural language learning - Volume 7

Enhancing the Bilingual Concordancer TransSearch with Word-Level Alignment

Canadian AI '09 Proceedings of the 22nd Canadian Conference on Artificial Intelligence: Advances in Artificial Intelligence
TransSearch: from a bilingual concordancer to a translation finder

Machine Translation
Generalized biwords for bitext compression and translation spotting

Journal of Artificial Intelligence Research
Combining EBMT, SMT, TM and IR technologies for quality and scale

EACL 2012 Proceedings of the Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra)
Generalized biwords for bitext compression and translation spotting: extended abstract

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence

Quantified Score

Hi-index	0.00

Visualization

Abstract

The term translation spotting (TS) refers to the task of identifying the target-language (TL) words that correspond to a given set of source-language (SL) words in a pair of text segments known to be mutual translations. This article examines this task within the context of a sub-sentential translation-memory system, i.e. a translation support tool capable of proposing translations for portions of a SL sentence, extracted from an archive of existing translations. Different methods are proposed, based on a statistical translation model. These methods take advantage of certain characteristics of the application, to produce TL segments submitted to constraints of contiguity and compositionality. Experiments show that imposing these constraints allows important gains in accuracy, with regard to the most probable alignments predicted by the model.