The system presented in this paper combines two techniques to transliterate directly from grapheme to grapheme. The approach makes no language-specific assumptions and uses no dictionaries or explicit phonetic information; it transforms sequences of tokens in the source language directly into sequences of tokens in the target language. All the language pairs in our experiments were transliterated by applying this technique in a single unified manner. We use hypothesis rescoring to integrate the models of two state-of-the-art techniques: phrase-based statistical machine translation (SMT) and a joint multigram model. The joint multigram model was used to generate an n-best list of transliteration hypotheses, which were then rescored using the models of the phrase-based SMT system. For each hypothesis, the scores from both models were linearly interpolated to produce a final score, which was used to rerank the hypotheses. In our experiments on development data, the combined system substantially outperformed both of its component systems.
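The rescoring step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `rerank`, the `weight` parameter, the toy hypotheses, and the assumption that both models return log-domain scores where higher is better are all hypothetical choices made for illustration. The abstract does not state the interpolation weight or the score scale.

```python
from typing import Callable, List, Tuple


def rerank(
    nbest: List[Tuple[str, float]],
    smt_score: Callable[[str], float],
    weight: float = 0.5,
) -> List[Tuple[str, float]]:
    """Rerank an n-best list by linearly interpolating two model scores.

    nbest     -- (hypothesis, joint multigram score) pairs; assumed log-domain
    smt_score -- hypothetical callable returning the SMT score of a hypothesis
    weight    -- assumed interpolation weight given to the joint multigram score
    """
    rescored = []
    for hypothesis, multigram_score in nbest:
        # Linear interpolation of the two scores for this hypothesis.
        combined = weight * multigram_score + (1.0 - weight) * smt_score(hypothesis)
        rescored.append((hypothesis, combined))
    # Higher combined score ranks first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Toy data: made-up scores, not results from the paper.
toy_nbest = [("hyp_a", -1.0), ("hyp_b", -2.0)]
toy_smt_scores = {"hyp_a": -5.0, "hyp_b": -1.0}

ranked = rerank(toy_nbest, toy_smt_scores.__getitem__, weight=0.5)
best_hypothesis = ranked[0][0]
```

In this toy case the joint multigram model alone prefers `hyp_a`, but the interpolated score moves `hyp_b` to the top. In practice the weight would presumably be tuned on development data.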