Using confidence bands for parallel texts alignment

Authors:
António Ribeiro;Gabriel Lopes;João Mexia
Affiliations:
Universidade Nova de Lisboa, Quinta da Torre, Portugal;Universidade Nova de Lisboa, Quinta da Torre, Portugal;Universidade Nova de Lisboa, Quinta da Torre, Portugal
Venue:
ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
Year:
2000

Citing 9
Cited 5

A Technical Word- and Term-Translation Aid Using Noisy Parallel Corpora across Language Groups

Machine Translation
Bilingual Sentence Alignment: Balancing Robustness and Accuracy

Machine Translation
Text-translation alignment

Computational Linguistics - Special issue on using large corpora: I
Bitext maps and alignment via pattern recognition

Computational Linguistics
Aligning sentences in parallel corpora

ACL '91 Proceedings of the 29th annual meeting on Association for Computational Linguistics
A program for aligning sentences in bilingual corpora

ACL '91 Proceedings of the 29th annual meeting on Association for Computational Linguistics
Char_align: a program for aligning parallel texts at the character level

ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics
Aligning a parallel English-Chinese corpus statistically with lexical criteria

ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics
Aligning portuguese and chinese parallel texts using confidence bands

PRICAI'00 Proceedings of the 6th Pacific Rim international conference on Artificial intelligence

Extracting Equivalents from Aligned Parallel Texts: Comparison of Measures of Similarity

IBERAMIA-SBIA '00 Proceedings of the International Joint Conference, 7th Ibero-American Conference on AI: Advances in Artificial Intelligence
A Self-Learning Method of Parallel Texts Alignment

AMTA '00 Proceedings of the 4th Conference of the Association for Machine Translation in the Americas on Envisioning Machine Translation in the Information Future
Knowledge intensive word alignment with KNOWA

COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Longest sorted sequence algorithm for parallel text alignment

EUROCAST'05 Proceedings of the 10th international conference on Computer Aided Systems Theory
Using natural alignment to extract translation equivalents

PROPOR'06 Proceedings of the 7th international conference on Computational Processing of the Portuguese Language

Quantified Score

Hi-index	0.00

Visualization

Abstract

This paper describes a language independent method for alignment of parallel texts that makes use of homograph tokens for each pair of languages. In order to filter out tokens that may cause misalignment, we use confidence bands of linear regression lines instead of heuristics which are not theoretically supported. This method was originally inspired on work done by Pascale Fung and Kathleen McKeown, and Melamed, providing the statistical support those authors could not claim.