Two-Stage Hypotheses Generation for Spoken Language Translation

Authors:
Boxing Chen; Min Zhang; Ai Ti Aw
Affiliations:
Institute for Infocomm Research, Singapore; Institute for Infocomm Research, Singapore; Institute for Infocomm Research, Singapore
Venue:
ACM Transactions on Asian Language Information Processing (TALIP)
Year:
2009

Abstract

Spoken Language Translation (SLT) is the research area that focuses on the translation of speech or text between two spoken languages. Phrase-based and syntax-based methods represent the state of the art in statistical machine translation (SMT). The phrase-based method excels at modeling local reorderings and translations of multiword expressions. The syntax-based method uses syntactic knowledge, which lets it better model long-distance word reorderings, discontinuous phrases, and syntactic structure. In this article, we leverage the strengths of these two methods and propose a strategy based on multiple hypotheses generation in a two-stage framework for spoken language translation. The hypotheses are generated in two stages, namely, decoding and regeneration. In the decoding stage, we apply state-of-the-art phrase-based and syntax-based methods to generate basic translation hypotheses. Then, in the regeneration stage, many more hypotheses that the decoding algorithms cannot capture are produced from the basic hypotheses. We study three regeneration methods in the second stage: redecoding, n-gram expansion, and confusion network. Finally, an additional reranking pass selects the translation outputs using a linear combination of rescoring models. Experimental results on the Chinese-to-English IWSLT-2006 challenge task of translating transcriptions of spontaneous speech show that the proposed mechanism achieves a significant improvement of about 2.80 BLEU points over the baseline.
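
The final reranking pass described in the abstract can be illustrated with a small sketch. It pools hypotheses from several sources and orders them by a weighted sum of feature scores. The feature names ("tm", "lm", "length"), the weights, and the example sentences are hypothetical and chosen only for illustration; the abstract does not specify the article's actual rescoring models or how their weights are tuned.

```python
from typing import Dict, List, Tuple

# A hypothesis is a translation string plus named feature scores.
# The feature names used below are illustrative assumptions, not the
# rescoring models of the article.
Hypothesis = Tuple[str, Dict[str, float]]


def rerank(hypotheses: List[Hypothesis],
           weights: Dict[str, float]) -> List[Tuple[float, str]]:
    """Score each distinct hypothesis as a weighted sum of its features.

    Returns (score, text) pairs sorted from best to worst. When the same
    string appears more than once (pooled lists often overlap), only the
    first occurrence is kept.
    """
    scored: List[Tuple[float, str]] = []
    seen = set()
    for text, features in hypotheses:
        if text in seen:
            continue
        seen.add(text)
        # Features without a weight contribute nothing to the score.
        score = sum(weights.get(name, 0.0) * value
                    for name, value in features.items())
        scored.append((score, text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


# Hypothetical pooled list: basic hypotheses from the decoding stage plus
# additional ones from the regeneration stage.
pooled: List[Hypothesis] = [
    ("i would like a cup of tea", {"tm": -4.0, "lm": -6.0, "length": 7.0}),
    ("i want one cup tea", {"tm": -3.5, "lm": -9.0, "length": 5.0}),
    ("i would like a cup of tea", {"tm": -4.2, "lm": -6.0, "length": 7.0}),
]
weights = {"tm": 1.0, "lm": 0.5, "length": 0.1}  # assumed, not tuned

ranked = rerank(pooled, weights)
best_score, best_text = ranked[0]
print(best_text)
```

In a real system the weights would typically be tuned on a development set against an evaluation metric such as BLEU rather than set by hand.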