Cross-lingual language modeling with syntactic reordering for low-resource speech recognition

Authors:
Ping Xu;Pascale Fung
Affiliations:
The Hong Kong University of Science and Technology, Hong Kong;The Hong Kong University of Science and Technology, Hong Kong
Venue:
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Year:
2012

Citing 14
Cited 0

Permutations, parenthesis words, and Schro¨der numbers

Discrete Mathematics
A systematic comparison of various statistical alignment models

Computational Linguistics
Stochastic inversion transduction grammars and bilingual parsing of parallel corpora

Computational Linguistics
Decoding complexity in word-replacement translation models

Computational Linguistics
BLEU: a method for automatic evaluation of machine translation

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
A comparative study on reordering constraints in statistical machine translation

ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
Machine Translation with Inferred Stochastic Finite-State Transducers

Computational Linguistics
Cross-lingual lexical triggers in statistical language modeling

EMNLP '03 Proceedings of the 2003 conference on Empirical methods in natural language processing
A weighted finite state transducer translation template model for statistical machine translation

Natural Language Engineering
Reordering constraints for phrase-based statistical machine translation

COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Local phrase reordering models for statistical machine translation

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Comparing reordering constraints for SMT using efficient Bleu oracle computation

SSST '07 Proceedings of the NAACL-HLT 2007/AMTA Workshop on Syntax and Structure in Statistical Translation
Novel reordering approaches in phrase-based statistical machine translation

ParaText '05 Proceedings of the ACL Workshop on Building and Using Parallel Texts
Improved statistical machine translation for resource-poor languages using related resource-rich languages

EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3 - Volume 3

Quantified Score

Hi-index	0.00

Visualization

Abstract

This paper proposes cross-lingual language modeling for transcribing source resource-poor languages and translating them into target resource-rich languages if necessary. Our focus is to improve the speech recognition performance of low-resource languages by leveraging the language model statistics from resource-rich languages. The most challenging work of cross-lingual language modeling is to solve the syntactic discrepancies between the source and target languages. We therefore propose syntactic reordering for cross-lingual language modeling, and present a first result that compares inversion transduction grammar (ITG) reordering constraints to IBM and local constraints in an integrated speech transcription and translation system. Evaluations on resource-poor Cantonese speech transcription and Cantonese to resource-rich Mandarin translation tasks show that our proposed approach improves the system performance significantly, up to 3.4% relative WER reduction in Cantonese transcription and 13.3% relative bilingual evaluation understudy (BLEU) score improvement in Mandarin transcription compared with the system without reordering.