Bilingual Sentence Alignment: Balancing Robustness and Accuracy

Authors:
Machine Translation staff
Affiliations:
-
Venue:
Machine Translation
Year:
1998

Abstract

Sentence alignment is the problem of making explicit the relations that exist between the sentences of two texts that are known to be mutual translations. Automatic sentence-alignment methods typically face two kinds of difficulties. First, there is the question of robustness. In real life, discrepancies between a source text and its translation are quite common: differences in layout, omissions, inversions, etc. Sentence-alignment programs must be ready to deal with such phenomena. Then, there is the question of accuracy. Even when translations are "clean", alignment is still not a trivial matter: some decisions are hard to make, even for humans. We report here on the current state of our ongoing efforts to produce a sentence-alignment program that is both robust and accurate. The method that we propose relies on two new alignment engines: one that produces highly reliable and robust character-level alignments, and one that relies on statistical lexical knowledge to produce accurate mappings. Experimental results are presented which demonstrate the method's effectiveness, and highlight where problems remain to be solved.
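
To make the task concrete, the following is a minimal sketch of a generic length-based sentence aligner using dynamic programming. It is not the method reported in the paper: the two engines described in the abstract (character-level alignment and statistical lexical mapping) are not reproduced here. The function name `align_by_length`, the set of allowed patterns, and the penalty values are all illustrative assumptions.

```python
from typing import List, Tuple

# Allowed alignment patterns as (source sentences, target sentences) consumed,
# each with a fixed penalty. These values are illustrative assumptions only.
PATTERN_PENALTY = {
    (1, 1): 0.0,  # one-to-one
    (1, 0): 3.0,  # source sentence omitted in the translation
    (0, 1): 3.0,  # target sentence with no source counterpart
    (2, 1): 1.0,  # two source sentences merged into one target sentence
    (1, 2): 1.0,  # one source sentence split into two target sentences
}

Bead = Tuple[List[int], List[int]]


def align_by_length(src: List[str], tgt: List[str]) -> List[Bead]:
    """Return a list of (source indices, target indices) pairs.

    The cost of each pair is its pattern penalty plus the relative
    difference in character length between the two sides.
    """
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for (di, dj), penalty in PATTERN_PENALTY.items():
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                src_len = sum(len(s) for s in src[i:ni])
                tgt_len = sum(len(t) for t in tgt[j:nj])
                mismatch = abs(src_len - tgt_len) / max(src_len, tgt_len, 1)
                total = cost[i][j] + penalty + mismatch
                if total < cost[ni][nj]:
                    cost[ni][nj] = total
                    back[ni][nj] = (i, j)

    # Walk the back-pointers from the end of both texts to the start.
    beads: List[Bead] = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        beads.append((list(range(pi, i)), list(range(pj, j))))
        i, j = pi, pj
    beads.reverse()
    return beads


if __name__ == "__main__":
    source = ["The cat sleeps.", "It is raining so we stay home."]
    target = ["Le chat dort.", "Il pleut beaucoup.", "Nous restons ici."]
    for src_ids, tgt_ids in align_by_length(source, target):
        print(src_ids, "<->", tgt_ids)
```

A purely length-based scheme like this one has no way to recover from the layout differences, omissions, and inversions mentioned in the abstract, which is the robustness problem the paper sets out to address.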