Leveraging supplemental representations for sequential transduction

Authors:
Aditya Bhargava;Grzegorz Kondrak
Affiliations:
University of Toronto, Toronto, ON, Canada;University of Alberta, Edmonton, AB, Canada
Venue:
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Year:
2012

Abstract

Sequential transduction tasks, such as grapheme-to-phoneme conversion and machine transliteration, are usually addressed by inducing models from sets of input-output pairs. Supplemental representations offer valuable additional information, but incorporating that information is not straightforward. We apply a unified reranking approach to both grapheme-to-phoneme conversion and machine transliteration, demonstrating substantial accuracy improvements by using heterogeneous transliterations and transcriptions of the input word. We describe several experiments that involve a variety of supplemental data and two state-of-the-art transduction systems, yielding error rate reductions ranging from 12% to 43%. We further apply our approach to system combination, with error rate reductions between 4% and 9%.
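
The general idea of reranking with supplemental representations can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the reranker or its features, so the function names (`similarity`, `rerank`), the `weight` parameter, the example strings and scores, and the use of plain character-level similarity are all assumptions made for illustration. In particular, real supplemental representations may be in other scripts or phoneme sets, where raw string similarity would not apply directly.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] (an illustrative stand-in feature)."""
    return SequenceMatcher(None, a, b).ratio()


def rerank(nbest, supplements, weight=1.0):
    """Reorder an n-best list using supplemental representations.

    nbest:       list of (candidate, base_score) pairs from a base system
    supplements: list of strings giving other representations of the input
    weight:      hypothetical hand-set weight on the supplemental feature
    """
    def combined(item):
        candidate, base_score = item
        # Best agreement between this candidate and any supplemental string;
        # 0.0 when no supplemental data is available.
        support = max((similarity(candidate, s) for s in supplements), default=0.0)
        return base_score + weight * support

    return sorted(nbest, key=combined, reverse=True)


# Hypothetical n-best output of a base transduction system.
nbest = [("tomas", 0.50), ("thomas", 0.48)]
# Hypothetical supplemental representations of the same input word.
supplements = ["thomas", "tomasu"]

reranked = rerank(nbest, supplements)
print(reranked[0][0])
```

In this toy setting the supplemental strings promote a candidate that the base system ranked second. A learned reranker would replace the fixed `weight` with parameters trained on held-out n-best lists.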