Bilingual Sentence Alignment: Balancing Robustness and Accuracy
Machine Translation
A program for aligning sentences in bilingual corpora
Computational Linguistics - Special issue on using large corpora: I
Bitext maps and alignment via pattern recognition
Computational Linguistics
Aligning sentences in parallel corpora
ACL '91 Proceedings of the 29th annual meeting on Association for Computational Linguistics
Char_align: a program for aligning parallel texts at the character level
ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics
Aligning a parallel English-Chinese corpus statistically with lexical criteria
ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics
Extracting Equivalents from Aligned Parallel Texts: Comparison of Measures of Similarity
IBERAMIA-SBIA '00 Proceedings of the International Joint Conference, 7th Ibero-American Conference on AI: Advances in Artificial Intelligence
Using confidence bands for parallel texts alignment
ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
This paper describes a language-independent method that aligns parallel texts using tokens that are homographs in a pair of languages. We show that homographs can be used with great reliability even for languages as different as Portuguese and Chinese. The work was originally inspired by, and extends, work by Pascale Fung & Kathleen McKeown and by Melamed. To filter out words that may cause misalignment, we use confidence bands of linear regression lines instead of statistically unsupported heuristics. The result is an alignment algorithm that is entirely statistically supported.
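The idea in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the function names (`candidate_points`, `fit_line`, `filter_by_band`) are hypothetical, the restriction to tokens with equal frequency in both texts is a simplifying assumption, the band is a single-pass prediction band around an ordinary least-squares line, and a normal quantile stands in for the Student t quantile to keep the code within the standard library.

```python
import math
from collections import Counter
from statistics import NormalDist


def candidate_points(src_tokens, tgt_tokens):
    """Pair positions of tokens spelled identically in both texts.

    Assumption of this sketch: only tokens with the same frequency in
    both texts are used, and their occurrences are paired in order.
    """
    src_counts, tgt_counts = Counter(src_tokens), Counter(tgt_tokens)
    shared = {t for t in src_counts if src_counts[t] == tgt_counts.get(t)}
    src_pos, tgt_pos = {}, {}
    for i, tok in enumerate(src_tokens):
        if tok in shared:
            src_pos.setdefault(tok, []).append(i)
    for j, tok in enumerate(tgt_tokens):
        if tok in shared:
            tgt_pos.setdefault(tok, []).append(j)
    points = []
    for tok in shared:
        points.extend(zip(src_pos[tok], tgt_pos[tok]))
    return sorted(points)


def fit_line(points):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, mean_x, sxx


def filter_by_band(points, confidence=0.999):
    """Keep only points that fall inside the prediction band of the line."""
    n = len(points)
    if n < 3:
        return list(points)  # too few points to estimate a band
    slope, intercept, mean_x, sxx = fit_line(points)
    residuals = [y - (intercept + slope * x) for x, y in points]
    s = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    # Normal approximation to the t quantile (an assumption of this sketch).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    kept = []
    for (x, y), r in zip(points, residuals):
        half_width = z * s * math.sqrt(1 + 1 / n + (x - mean_x) ** 2 / sxx)
        if abs(r) <= half_width:
            kept.append((x, y))
    return kept
```

In this sketch, `candidate_points` would be called on the two tokenized texts and its output passed to `filter_by_band`; the surviving points serve as reliable correspondence points. A fuller treatment would likely repeat the filtering, since a single outlier inflates the residual variance and widens the band on the first pass.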