Bridging morpho-syntactic gap between source and target sentences for English-Korean statistical machine translation

Authors:
Gumwon Hong;Seung-Wook Lee;Hae-Chang Rim
Affiliations:
Korea University, Seoul, Korea;Korea University, Seoul, Korea;Korea University, Seoul, Korea
Venue:
ACLShort '09 Proceedings of the ACL-IJCNLP 2009 Conference Short Papers
Year:
2009

Abstract

Statistical machine translation (SMT) between English and Korean often suffers from null alignment. Previous studies have attempted to resolve this problem by removing unnecessary function words or by reordering source sentences. However, removing function words can cause a serious loss of information. In this paper, we present a method of bridging the morpho-syntactic gap for English-Korean SMT. In particular, the proposed method transforms a source sentence by inserting pseudo words and by reordering the sentence so that the source and target sentences have a similar length and word order. The proposed method achieves an increase of 2.4 in BLEU score over a baseline phrase-based system.
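
To make the two source-side transformations concrete, the following is a minimal toy sketch. It is not the authors' implementation: the role labels (`SUBJ`, `OBJ`, `VERB`), the pseudo-word tokens (`<SUBJ>`, `<OBJ>`), and the single verb-final rule are all hypothetical, chosen only to show how inserting pseudo words and reordering could bring an English sentence closer to Korean-like length and word order.

```python
# Toy illustration only: labels, pseudo-word tokens, and rules are
# hypothetical and are NOT taken from the paper.

def insert_pseudo_words(tagged):
    """Insert a hypothetical role marker after each subject or object token."""
    markers = {"SUBJ": "<SUBJ>", "OBJ": "<OBJ>"}
    out = []
    for word, role in tagged:
        out.append((word, role))
        if role in markers:
            out.append((markers[role], "PSEUDO"))
    return out


def reorder_verb_final(tagged):
    """Move tokens tagged VERB to the end, keeping all other tokens in order."""
    non_verbs = [t for t in tagged if t[1] != "VERB"]
    verbs = [t for t in tagged if t[1] == "VERB"]
    return non_verbs + verbs


def transform(tagged):
    """Apply pseudo-word insertion, then reordering; return the word sequence."""
    return [w for w, _ in reorder_verb_final(insert_pseudo_words(tagged))]


sentence = [("John", "SUBJ"), ("reads", "VERB"), ("books", "OBJ")]
print(" ".join(transform(sentence)))  # John <SUBJ> books <OBJ> reads
```

In this toy case the transformed source has five tokens in subject-object-verb order, so each inserted marker gives a target-side function morpheme something to align with instead of a null token.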