Chinese syntactic reordering for adequate generation of Korean verbal phrases in Chinese-to-Korean SMT

Authors:
Jin-Ji Li;Jungi Kim;Dong-Il Kim;Jong-Hyeok Lee
Affiliations:
Pohang University of Science and Technology (POSTECH), Pohang, R. of Korea;Pohang University of Science and Technology (POSTECH), Pohang, R. of Korea;Yanbian University of Science and Technology (YUST), Jilin, P.R. of China;Pohang University of Science and Technology (POSTECH), Pohang, R. of Korea
Venue:
StatMT '09 Proceedings of the Fourth Workshop on Statistical Machine Translation
Year:
2009

Abstract

Chinese and Korean are typologically different languages in terms of both word order and morphology: Chinese is an SVO, morphologically poor language, whereas Korean is an SOV, morphologically rich one. In Chinese-to-Korean SMT systems, systematic differences between the verbal systems of the two languages make Korean verbal phrases difficult to generate. We address two issues in this paper. The first, from the viewpoint of word-order typology, is that the verb occupies a different position in the two languages. The second, from the viewpoint of morphological typology, is the difficulty of generating the complex morphology of Korean verbs. We propose a Chinese syntactic reordering method that is better suited to generating Korean verbal phrases in Chinese-to-Korean SMT. Specifically, we consider reordering rules targeting Chinese verb phrases (VPs), prepositional phrases (PPs), and modality-bearing words, all of which are closely related to Korean verbal phrases. We evaluate our system on two corpora from different domains. The proposed approach significantly improves performance over a baseline phrase-based SMT system, with relative improvements of +9.32% and +5.43% on the two corpora, respectively.
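
The abstract does not spell out the reordering rules themselves, so the following is only an illustrative sketch of the general idea of source-side reordering from SVO toward SOV order. It assumes a toy parse tree encoded as nested tuples with Penn Chinese Treebank style labels (IP, VP, NP, VV, NN, PN); the function names `reorder` and `leaves` and the single "move the verb to the end of its VP" rule are hypothetical and are not the rules proposed in the paper.

```python
# Toy tree encoding (an assumption for this sketch):
#   phrase: (label, [children])    leaf: (pos_tag, word)

def is_leaf(node):
    """A leaf stores its word as a string in the second slot."""
    return isinstance(node[1], str)

def reorder(node):
    """Recursively move verbs (VV) to the end of each VP.

    This is one hypothetical verb-final rule, not the paper's rule set.
    """
    if is_leaf(node):
        return node
    label, children = node
    children = [reorder(child) for child in children]
    if label == "VP":
        verbs = [c for c in children if is_leaf(c) and c[0] == "VV"]
        others = [c for c in children if not (is_leaf(c) and c[0] == "VV")]
        children = others + verbs  # verb-final order inside the VP
    return (label, children)

def leaves(node):
    """Return the words of the tree in left-to-right order."""
    if is_leaf(node):
        return [node[1]]
    return [word for child in node[1] for word in leaves(child)]

# "I like music": subject, verb, object in the Chinese source order.
tree = ("IP", [
    ("NP", [("PN", "我")]),
    ("VP", [("VV", "喜欢"), ("NP", [("NN", "音乐")])]),
])

original_words = leaves(tree)
reordered_words = leaves(reorder(tree))
print(original_words, reordered_words)
```

In a pipeline of this kind, the reordered Chinese sentences would typically replace the original source side before training and decoding with a phrase-based system, so that the source word order is closer to the Korean target order.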