Application of clause alignment for statistical machine translation

Authors:
Svetla Koeva; Borislav Rizov; Ivelina Stoyanova; Svetlozara Leseva; Rositsa Dekova; Angel Genov; Ekaterina Tarpomanova; Tsvetana Dimitrova; Hristina Kukova
Affiliations:
Bulgarian Academy of Sciences, Sofia, Bulgaria (all authors)
Venue:
SSST-6 '12 Proceedings of the Sixth Workshop on Syntax, Semantics and Structure in Statistical Translation
Year:
2012

References:
A statistical approach to machine translation. Computational Linguistics.
Experiment on linguistically-based term associations. Information Processing and Management: an International Journal.
A program for aligning sentences in bilingual corpora. Computational Linguistics - Special issue on using large corpora: I.
Text-translation alignment. Computational Linguistics - Special issue on using large corpora: I.
Char_align: a program for aligning parallel texts at the character level. ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics.
Statistical phrase-based translation. NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1.
The complexity of phrase alignment problems. HLT-Short '08 Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers.
Moses: open source toolkit for statistical machine translation. ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions.
A discriminative model for tree-to-tree translation. EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing.
Divide and translate: improving long distance reordering in statistical machine translation. WMT '10 Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR.

Abstract

The paper presents a new, flexible, resource-light method for clause alignment that combines the Gale-Church algorithm with internally collected textual information. The method does not rely on any pre-developed linguistic resources, which makes it well suited to resource-light clause alignment. We experiment with combining the method with the original Gale-Church algorithm (1993) applied to clause alignment. The performance of this flexible method, as it is referred to hereafter, is measured on a specially designed test corpus. Clause alignment is explored as a means of providing improved training data for Statistical Machine Translation (SMT). A series of experiments with Moses demonstrates ways to modify the parallel resource and their effects on translation quality: (1) baseline training with a Bulgarian-English parallel corpus aligned at sentence level; (2) training based on parallel clause pairs; (3) training with clause reordering, where the clauses in each source language (SL) sentence are reordered according to the order of the clauses in the target language (TL) sentence. Evaluation is based on the BLEU score and shows a small improvement when the clause-aligned corpus is used.
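
The clause reordering in experiment (3) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `reorder_source_clauses`, the representation of an alignment as (source index, target index) pairs, and the handling of unaligned clauses (they stay attached to the preceding aligned clause) are all assumptions made here for illustration.

```python
from typing import Dict, List, Tuple


def reorder_source_clauses(
    source_clauses: List[str],
    alignment: List[Tuple[int, int]],
) -> List[str]:
    """Reorder SL clauses to follow the order of their aligned TL clauses.

    `alignment` holds hypothetical (source_index, target_index) pairs.
    Assumption: an unaligned source clause inherits the position of the
    nearest preceding aligned clause, so it travels with its left neighbour.
    """
    # Earliest aligned target position for each source clause.
    target_pos: Dict[int, int] = {}
    for src, tgt in alignment:
        target_pos[src] = min(tgt, target_pos.get(src, tgt))

    # Sort key per source clause; unaligned leading clauses stay in front.
    keys: List[int] = []
    last_key = -1
    for i in range(len(source_clauses)):
        last_key = target_pos.get(i, last_key)
        keys.append(last_key)

    # sorted() is stable, so clauses sharing a key keep their source order.
    order = sorted(range(len(source_clauses)), key=lambda i: keys[i])
    return [source_clauses[i] for i in order]


if __name__ == "__main__":
    # Placeholder clause labels, not data from the paper's corpus.
    sl = ["SL-clause-1", "SL-clause-2", "SL-clause-3"]
    # SL clause 1 aligns to TL clause 2 and vice versa; clause 3 stays last.
    print(reorder_source_clauses(sl, [(0, 1), (1, 0), (2, 2)]))
```

Reordering only the source side leaves the target sentence untouched, so the reordered pair can be passed to a standard training pipeline in place of the original sentence pair.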