Alignment methods based on byte-length comparisons of alignment blocks have been remarkably successful for aligning good translations from legislative transcriptions. For noisy translations, in which the parallel texts of a document differ significantly in structure, byte-alignment methods often perform poorly. The Pan American Health Organization (PAHO) corpus is a series of articles that were first translated by machine methods and then improved by professional translators. Many of the Spanish PAHO texts do not share formatting conventions with the corresponding English documents, refer to tables in stylistically different ways, and contain extraneous information. A method that keeps the dynamic programming framework but uses a decision criterion derived from a combination of byte-length ratio measures, hard matching of numbers, string comparisons, and n-gram co-occurrence matching substantially improves the performance of the alignment process.
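
As a rough illustration of how such a combined criterion might plug into a dynamic programming aligner, the following Python sketch scores each candidate sentence pair with the four cues named above and then searches for the cheapest monotone alignment. It is not the authors' implementation: the function names, the weights (0.4, 0.2, 0.2, 0.2), the skip cost of 0.6, the use of UTF-8 byte lengths, character trigrams, and `difflib.SequenceMatcher` as the string comparison are all assumptions chosen for the example. For brevity it handles only 1-1 matches and skipped segments, not merged blocks.

```python
import re
from difflib import SequenceMatcher


def ngrams(text, n=3):
    """Set of character n-grams of the lowercased text."""
    t = text.lower()
    return {t[i:i + n] for i in range(len(t) - n + 1)}


def dice(a, b):
    """Dice overlap of two sets; defined here as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def similarity(src, tgt):
    """Combined score in [0, 1]. The weights are illustrative assumptions."""
    ls, lt = len(src.encode("utf-8")), len(tgt.encode("utf-8"))
    length = min(ls, lt) / max(ls, lt) if max(ls, lt) else 0.0  # byte-length ratio
    num = dice(set(re.findall(r"\d+", src)),
               set(re.findall(r"\d+", tgt)))                    # exact number match
    string = SequenceMatcher(None, src.lower(), tgt.lower()).ratio()
    gram = dice(ngrams(src), ngrams(tgt))                       # shared trigrams
    return 0.4 * num + 0.2 * length + 0.2 * string + 0.2 * gram


def align(src, tgt, skip_cost=0.6):
    """Minimum-cost monotone alignment with 1-1 matches and skips.

    Returns (src_index, tgt_index) pairs; None marks a skipped side.
    """
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            steps = []
            if i and j:  # match src[i-1] with tgt[j-1]
                steps.append((cost[i - 1][j - 1] + 1.0
                              - similarity(src[i - 1], tgt[j - 1]), (1, 1)))
            if i:        # leave src[i-1] unaligned
                steps.append((cost[i - 1][j] + skip_cost, (1, 0)))
            if j:        # leave tgt[j-1] unaligned (extraneous material)
                steps.append((cost[i][j - 1] + skip_cost, (0, 1)))
            for c, step in steps:
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, step
    # Walk the back-pointers from the end to recover the alignment.
    pairs, i, j = [], n, m
    while i or j:
        di, dj = back[i][j]
        pairs.append((i - 1 if di else None, j - 1 if dj else None))
        i, j = i - di, j - dj
    pairs.reverse()
    return pairs


english = ["Table 3 shows 120 cases in 1990.",
           "The rate fell to 45 percent.",
           "See section 7 for details."]
spanish = ["El cuadro 3 muestra 120 casos en 1990.",
           "Nota del editor.",
           "La tasa bajó a 45 por ciento.",
           "Véase la sección 7 para más detalles."]
print(align(english, spanish))
# → [(0, 0), (None, 1), (1, 2), (2, 3)]
```

In this made-up example the extraneous Spanish line has no counterpart and is skipped, while the shared numbers anchor the remaining pairs even though their byte lengths differ. In practice the weights and the skip cost would need tuning on held-out aligned text.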