Effective use of function words for rule generalization in forest-based translation

Authors:
Xianchao Wu;Takuya Matsuzaki;Jun'ichi Tsujii
Affiliations:
The University of Tokyo;The University of Tokyo;University of Manchester and The University of Tokyo and National Centre for Text Mining (NaCTeM)
Venue:
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Year:
2011

Citing 18
Cited 2

A systematic comparison of various statistical alignment models

Computational Linguistics
BLEU: a method for automatic evaluation of machine translation

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Statistical phrase-based translation

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Japanese dependency analysis using cascaded chunking

COLING-02 proceedings of the 6th conference on Natural language learning - Volume 20
A hierarchical phrase-based model for statistical machine translation

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Tree-to-string alignment template for statistical machine translation

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Scalable inference and training of context-rich syntactic translation models

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Feature forest models for probabilistic hpsg parsing

Computational Linguistics
Moses: open source toolkit for statistical machine translation

ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
Forest-based translation rule extraction

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Using a dependency parser to improve SMT for subject-object-verb languages

NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Better k-best parsing

Parsing '05 Proceedings of the Ninth International Workshop on Parsing Technology
Demonstration of Joshua: an open source toolkit for parsing-based machine translation

ACLDemos '09 Proceedings of the ACL-IJCNLP 2009 Software Demonstrations
Improving tree-to-tree translation with packed forests

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2 - Volume 2
Fine-grained tree-to-string translation rule extraction

ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Constituency to dependency translation with forests

ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Learning to translate with source and target syntax

ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Re-structuring, re-labeling, and re-aligning for syntax-based machine translation

Computational Linguistics

Akamon: an open source toolkit for tree/forest-based statistical machine translation

ACL '12 Proceedings of the ACL 2012 System Demonstrations
Re-training monolingual parser bilingually for syntactic SMT

EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

Quantified Score

Hi-index	0.00

Visualization

Abstract

In the present paper, we propose the effective usage of function words to generate generalized translation rules for forest-based translation. Given aligned forest-string pairs, we extract composed tree-to-string translation rules that account for multiple interpretations of both aligned and unaligned target function words. In order to constrain the exhaustive attachments of function words, we limit to bind them to the nearby syntactic chunks yielded by a target dependency parser. Therefore, the proposed approach can not only capture source-tree-to-target-chunk correspondences but can also use forest structures that compactly encode an exponential number of parse trees to properly generate target function words during decoding. Extensive experiments involving large-scale English-to-Japanese translation revealed a significant improvement of 1.8 points in BLEU score, as compared with a strong forest-to-string baseline system.