A Self-Learning Method of Parallel Texts Alignment

Authors:
António Ribeiro;José Gabriel Pereira Lopes;João Mexia
Affiliations:
-;-;-
Venue:
AMTA '00 Proceedings of the 4th Conference of the Association for Machine Translation in the Americas on Envisioning Machine Translation in the Information Future
Year:
2000

Citing 10
Cited 0

A Technical Word- and Term-Translation Aid Using Noisy Parallel Corpora across Language Groups

Machine Translation
Bilingual Sentence Alignment: Balancing Robustness and Accuracy

Machine Translation
Using LocalMaxs Algorithm for the Extraction of Contiguous and Non-contiguous Multiword Lexical Units

EPIA '99 Proceedings of the 9th Portuguese Conference on Artificial Intelligence: Progress in Artificial Intelligence
Text-translation alignment

Computational Linguistics - Special issue on using large corpora: I
Bitext maps and alignment via pattern recognition

Computational Linguistics
Aligning sentences in parallel corpora

ACL '91 Proceedings of the 29th annual meeting on Association for Computational Linguistics
A program for aligning sentences in bilingual corpora

ACL '91 Proceedings of the 29th annual meeting on Association for Computational Linguistics
Char_align: a program for aligning parallel texts at the character level

ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics
Aligning a parallel English-Chinese corpus statistically with lexical criteria

ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics
Using confidence bands for parallel texts alignment

ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics

Quantified Score

Hi-index	0.00

Visualization

Abstract

This paper describes a language independent method for alignment of parallel texts that re-uses acquired knowledge. The system extracts word translation equivalents and re-uses them as correspondence points in order to enhance the alignment of parallel texts. Points that may cause misalignment are filtered using confidence bands of linear regression analysis instead of heuristics, which are not theoretically reliable. Homographs bootstrap the alignment process so as to build the primary word translation lexicon. At each step, the previously acquired lexicon is re-used so as to repeatedly make. finer-grained alignments and produce more reliable translation lexicons.