EXTRA: a system for example-based translation assistance

Authors:
Federica Mandreoli;Riccardo Martoglia;Paolo Tiberio
Affiliations:
Dip. di Ingegneria dell'Informazione, Università di Modena e Reggio Emilia, Modena, Italy (all three authors)
Venue:
Machine Translation
Year:
2006

Abstract

In this paper we present EXTRA (EXample-based TRanslation Assistant), a translation memory (TM) system. EXTRA proposes effective translation suggestions by relying on syntactic analysis of the text and on a rigorous, language-independent measure; its advanced retrieval techniques make the search efficient over large amounts of bilingual text. EXTRA does not use external knowledge that would require user intervention, and it is completely customizable and portable because it is implemented on top of a standard database management system (DBMS). The paper provides a thorough evaluation of both the effectiveness and the efficiency of the system. In particular, to quantify the benefits of EXTRA-assisted translation over manual translation, we introduce a simulator implementing specifically devised statistical, process-oriented, discrete-event models. As far as we know, this is the first time statistical simulation experiments have been used to address the nontrivial problem of evaluating TM systems, in particular to compare the time that assisted translation could save over "manual" translation and to tune the system's behaviour optimally for users with different skill levels. In our experiments we considered three scenarios: manual translation with one translator, manual translation with two translators, and assisted translation with one translator. The time one translator needs for an assisted translation is significantly closer to that of a team of two translators than to that of a single translator working manually. The mean sentence translation time is by far the lowest in the assisted scenario, which corresponds to the highest per-translator productivity. We also estimate the total translation time as the number of query sentences, the maximum number of suggestions to be read, and the probability of look-up are varied: the best trade-off is obtained by reading (and presenting) at most four or five suggestions.
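To make the three scenarios and the tuned parameters concrete, the following is a minimal Monte Carlo sketch in Python. It is not the paper's process-oriented discrete-event simulator, and it reproduces none of its results. Every constant (`MEAN_MANUAL_TIME`, `READ_TIME`, `P_USEFUL`, `EDIT_FACTOR`), the exponential timing model, and all function names are hypothetical choices made purely for illustration. Only the three scenarios and the three varied quantities (number of query sentences, maximum number of suggestions read, probability of look-up) come from the abstract.

```python
import random

# Hypothetical placeholder values, NOT figures from the paper.
MEAN_MANUAL_TIME = 60.0  # mean seconds to translate one sentence from scratch
READ_TIME = 5.0          # seconds spent reading one suggestion
P_USEFUL = 0.3           # chance that a suggestion is usable
EDIT_FACTOR = 0.4        # share of manual time needed to adapt a usable suggestion


def manual_time(rng):
    """Draw one from-scratch translation time (assumed exponential)."""
    return rng.expovariate(1.0 / MEAN_MANUAL_TIME)


def assisted_time(rng, max_suggestions, p_lookup):
    """Time for one sentence when the translator may consult the TM."""
    base = manual_time(rng)
    if rng.random() >= p_lookup:
        return base  # translator skips the look-up
    spent = 0.0
    for _ in range(max_suggestions):
        spent += READ_TIME
        if rng.random() < P_USEFUL:
            # A usable suggestion only needs to be adapted.
            return spent + EDIT_FACTOR * base
    # No usable suggestion: reading time is lost, translate from scratch.
    return spent + base


def total_time(n_sentences, scenario, max_suggestions=4, p_lookup=0.8, seed=0):
    """Total elapsed time for a document under one of the three scenarios."""
    rng = random.Random(seed)
    if scenario == "manual_1":
        return sum(manual_time(rng) for _ in range(n_sentences))
    if scenario == "manual_2":
        # Two translators split the sentences and work in parallel.
        half = n_sentences // 2
        first = sum(manual_time(rng) for _ in range(half))
        second = sum(manual_time(rng) for _ in range(n_sentences - half))
        return max(first, second)
    if scenario == "assisted_1":
        return sum(
            assisted_time(rng, max_suggestions, p_lookup)
            for _ in range(n_sentences)
        )
    raise ValueError(f"unknown scenario: {scenario!r}")
```

Calling `total_time(n, "assisted_1", max_suggestions=k)` for a range of `k` values, averaged over several seeds, gives a rough picture of the trade-off between time spent reading suggestions and time saved by reusing them. The numbers depend entirely on the assumed constants.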