Rescoring a phrase-based machine transliteration system with recurrent neural network language models

  • Authors:
  • Andrew Finch, Paul Dixon, Eiichiro Sumita

  • Affiliation:
  • NICT, Hikaridai Keihanna Science City, Japan (all authors)

  • Venue:
  • NEWS '12 Proceedings of the 4th Named Entity Workshop
  • Year:
  • 2012

Abstract

The system entered into this year's shared transliteration evaluation is implemented within a phrase-based statistical machine transliteration (SMT) framework. The system is based on a joint source-channel model in combination with a target language model and models that control the length of the generated sequences. The joint source-channel model was trained using a many-to-many Bayesian bilingual alignment. The focus of this year's system is on input representation. In an attempt to mitigate data sparseness issues in the joint source-channel model, we augmented the system with recurrent neural network (RNN) models that can learn to project the grapheme set onto a smaller hidden representation. We performed experiments on development data to evaluate the effectiveness of our approach. Our results show that using an RNN language model can improve performance for language pairs with large grapheme sets on the target side.
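To make the rescoring idea concrete, the sketch below shows (not the authors' actual system) how a character-level Elman RNN language model could be used to rescore an n-best list of transliteration hypotheses, combining the RNN log-probability log-linearly with the base decoder score. The vocabulary, randomly initialised weights, hypotheses, and scores are all illustrative stand-ins for a trained model and real decoder output.

```python
import math
import random

# A minimal sketch: character-level Elman RNN LM for n-best rescoring.
# Random weights stand in for a trained model; everything here is illustrative.
random.seed(0)

VOCAB = list("abcdeghiknorstu#")  # '#' marks start/end of a sequence
V = len(VOCAB)
H = 8  # hidden layer size
IDX = {c: i for i, c in enumerate(VOCAB)}

Wxh = [[random.uniform(-0.1, 0.1) for _ in range(V)] for _ in range(H)]
Whh = [[random.uniform(-0.1, 0.1) for _ in range(H)] for _ in range(H)]
Why = [[random.uniform(-0.1, 0.1) for _ in range(H)] for _ in range(V)]

def rnn_log_prob(seq):
    """Sum of log P(next char | history) under the RNN LM."""
    h = [0.0] * H
    logp = 0.0
    chars = ["#"] + list(seq) + ["#"]
    for prev, nxt in zip(chars, chars[1:]):
        x = IDX[prev]
        # Elman recurrence: h_t = tanh(Wxh * x_t + Whh * h_{t-1})
        h = [math.tanh(Wxh[i][x] + sum(Whh[i][j] * h[j] for j in range(H)))
             for i in range(H)]
        # Softmax over the output layer gives the next-character distribution
        logits = [sum(Why[k][i] * h[i] for i in range(H)) for k in range(V)]
        m = max(logits)
        logz = m + math.log(sum(math.exp(l - m) for l in logits))
        logp += logits[IDX[nxt]] - logz
    return logp

def rescore(nbest, weight=0.5):
    """Pick the hypothesis maximising decoder score + weight * RNN LM score."""
    return max(nbest, key=lambda h: h[1] + weight * rnn_log_prob(h[0]))

# Hypothetical n-best list: (hypothesis, decoder log score)
nbest = [("tanaka", -2.1), ("tanaca", -1.9), ("tanakka", -2.5)]
print(rescore(nbest)[0])
```

With `weight=0`, the rescorer reduces to picking the decoder's own best hypothesis; increasing the weight lets the RNN LM override near-ties, which is where the abstract reports gains for large target-side grapheme sets.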