Empirical lower bounds on translation unit error rate for the full class of inversion transduction grammars

Authors:
Anders Søgaard; Dekai Wu
Affiliations:
University of Copenhagen; Hong Kong University of Science and Technology
Venue:
IWPT '09 Proceedings of the 11th International Conference on Parsing Technologies
Year:
2009

References: 17
Cited by: 4

The theory of parsing, translation, and compiling
Stochastic inversion transduction grammars and bilingual parsing of parallel corpora

Computational Linguistics
A comparison of alignment models for statistical machine translation

COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 2
A comparative study on reordering constraints in statistical machine translation

ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Phrasal cohesion and statistical machine translation

EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Aligning words using matrix factorisation

ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Empirical lower bounds on the complexity of translational equivalence

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Optimal constituent alignment with edge covers for semantic projection

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Syntax-based alignment: supervised or unsupervised?

COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Translating with non-contiguous phrases

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Some computational complexity results for synchronous context-free grammars

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Purest ever example-based machine translation: Detailed presentation and assessment

Machine Translation
Measuring Word Alignment Quality for Statistical Machine Translation

Computational Linguistics
Probabilistic synchronous tree-adjoining grammars for machine translation: the argument from bilingual dictionaries

SSST '07 Proceedings of the NAACL-HLT 2007/AMTA Workshop on Syntax and Structure in Statistical Translation
Empirical lower bounds on alignment error rates in syntax-based machine translation

SSST '09 Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation
On the complexity of alignment problems in two synchronous grammar formalisms

SSST '09 Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation
Machine translation as lexicalized parsing with hooks

Parsing '05 Proceedings of the Ninth International Workshop on Parsing Technology

Cited by:

Accurate non-hierarchical phrase-based translation

HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Complete search space exploration for SITG inside probability

SSPR&SPR'10 Proceedings of the 2010 joint IAPR international conference on Structural, syntactic, and statistical pattern recognition
A $\mathcal{O}(|G|n^6)$ time extension of inversion transduction grammars

Machine Translation
Maximum-entropy word alignment and posterior-based phrase extraction for machine translation

Machine Translation

Abstract

Empirical lower bound studies, which estimate the frequency of alignment configurations that a particular formalism cannot induce, have been important for the development of syntax-based machine translation formalisms. The formalism that has received the most attention is inversion transduction grammars (ITGs) (Wu, 1997). All previous work on the coverage of ITGs, however, concerns parse failure rates (PFRs) or sentence-level coverage, which is not directly related to any of the evaluation measures used in machine translation. Søgaard and Kuhn (2009) induce lower bounds on translation unit error rates (TUERs) for a number of formalisms, including normal form ITGs, but not for the full class of ITGs. Many of the alignment configurations that cannot be induced by normal form ITGs can, however, be induced by unrestricted ITGs. This paper estimates the difference and shows that the average reduction in lower bounds on TUER is 2.48 in absolute terms (16.01 for the average parse failure rate).
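
To make the TUER measure mentioned in the abstract concrete, here is a minimal Python sketch. It rests on assumptions that the abstract itself does not state: that a translation unit is a maximal connected group of word-alignment links, and that TUER is computed as 1 - 2|S ∩ G| / (|S| + |G|), where S and G are the sets of translation units in a system alignment and a gold alignment. The function names `translation_units` and `tuer` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def translation_units(links):
    """Group alignment links into translation units.

    Assumption: a translation unit is a connected component of the
    bipartite graph whose edges are (source_index, target_index) links.
    Each unit is returned as a frozenset of links.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    links = set(links)
    for i, j in links:
        root_s, root_t = find(("s", i)), find(("t", j))
        if root_s != root_t:
            parent[root_s] = root_t

    groups = defaultdict(set)
    for i, j in links:
        groups[find(("s", i))].add((i, j))
    return {frozenset(group) for group in groups.values()}


def tuer(gold_links, system_links):
    """Translation unit error rate under the assumed definition above."""
    gold = translation_units(gold_links)
    system = translation_units(system_links)
    if not gold and not system:
        return 0.0
    return 1.0 - 2.0 * len(gold & system) / (len(gold) + len(system))


# Toy example (invented data): the gold alignment links source word 1 to
# two target words, while the system alignment keeps only one of the links.
gold = [(0, 0), (1, 1), (1, 2), (2, 3)]
system = [(0, 0), (1, 1), (2, 3)]
print(tuer(gold, system))
```

In the toy example the two alignments share two of their three translation units, so the value is 1 - 4/6. Under this reading, a lower bound on TUER for a formalism would be the error of the best alignment that the formalism is able to induce for a given gold alignment.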