Forest-based translation rule extraction

Authors:
Haitao Mi;Liang Huang
Affiliations:
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China;University of Pennsylvania, Philadelphia, PA and Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Venue:
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Year:
2008

Abstract

Translation rule extraction is a fundamental problem in machine translation, especially for linguistically syntax-based systems that need parse trees from either or both sides of the bi-text. The current dominant practice uses only 1-best trees, which degrades rule set quality because of parsing errors. We therefore propose a novel approach that extracts rules from a packed forest, which compactly encodes exponentially many parses. Experiments show that this method improves translation quality by over 1 BLEU point on a state-of-the-art tree-to-string system, and is 0.5 points better than (and twice as fast as) extracting from 30-best parses. When combined with our previous work on forest-based decoding, it achieves a 2.5 BLEU point improvement over the baseline, and even outperforms the hierarchical system Hiero by 0.7 points.
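
To make the phrase "compactly encodes exponentially many parses" concrete, the following is a minimal sketch of a packed forest represented as a hypergraph, with a dynamic program that counts the trees it encodes. It is not the rule extraction algorithm of the paper. The node encoding `("X", i, j)`, the function names `build_all_binary_forest` and `count_trees`, and the toy grammar (every binary bracketing of an n-word sentence under a single label) are assumptions chosen purely for illustration.

```python
# Illustrative sketch only: a packed forest as a hypergraph.
# Names and the toy "all binary bracketings" grammar are hypothetical,
# not taken from the paper.

def build_all_binary_forest(n):
    """Return a forest over n words as {node: [hyperedge, ...]}.

    A node is ("X", i, j), covering words i..j-1. A hyperedge is a tuple
    of child nodes. Single-word spans have no hyperedges (leaves).
    """
    forest = {}
    for width in range(1, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            # One hyperedge per split point k; all of them share the same
            # child nodes, which is what makes the forest "packed".
            forest[("X", i, j)] = [
                (("X", i, k), ("X", k, j)) for k in range(i + 1, j)
            ]
    return forest


def count_trees(forest, root):
    """Count the distinct trees rooted at `root` by memoized recursion."""
    memo = {}

    def visit(node):
        if node in memo:
            return memo[node]
        edges = forest.get(node, [])
        if not edges:
            total = 1  # a leaf contributes exactly one (trivial) tree
        else:
            total = 0
            for tails in edges:
                product = 1
                for child in tails:
                    product *= visit(child)
                total += product  # alternatives add, children multiply
        memo[node] = total
        return total

    return visit(root)


if __name__ == "__main__":
    n = 10
    forest = build_all_binary_forest(n)
    print("nodes:", len(forest))
    print("trees:", count_trees(forest, ("X", 0, n)))
```

For a 10-word sentence this toy forest has 55 nodes yet encodes 4862 distinct trees, which is why enumerating a k-best list covers only a small fraction of what the forest represents.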