G2P conversion of proper names using word origin information

Authors:
Sonjia Waxmonsky;Sravana Reddy
Affiliations:
The University of Chicago, Chicago, IL;The University of Chicago, Chicago, IL
Venue:
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Year:
2012

Abstract

Motivated by the fact that the pronunciation of a name may be influenced by its language of origin, we present methods to improve pronunciation prediction of proper names using word origin information. We train grapheme-to-phoneme (G2P) models on language-specific data sets and interpolate the outputs. We perform experiments on US surnames, a data set where word origin variation occurs naturally. Our methods can be used with any G2P algorithm that outputs posterior probabilities of phoneme sequences for a given word.
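
The interpolation step described in the abstract can be illustrated with a minimal sketch. It assumes a simple linear interpolation in which each origin-specific model's posterior over candidate pronunciations is weighted by a per-origin weight (for example, an estimate of how likely the name is to have that origin). The abstract does not specify the exact interpolation scheme, so the function names, the weighting, and the toy numbers below are all hypothetical and purely illustrative.

```python
from collections import defaultdict


def interpolate_posteriors(model_outputs, origin_weights):
    """Linearly combine per-origin posteriors over candidate pronunciations.

    model_outputs:  {origin: {pronunciation: P(pronunciation | word, origin)}}
    origin_weights: {origin: non-negative weight}, normalized here to sum to 1.
    Both structures are assumptions made for this sketch.
    """
    total = sum(origin_weights.values())
    if total <= 0:
        raise ValueError("origin weights must sum to a positive value")

    combined = defaultdict(float)
    for origin, posteriors in model_outputs.items():
        # Origins without a weight contribute nothing to the mixture.
        weight = origin_weights.get(origin, 0.0) / total
        for pronunciation, prob in posteriors.items():
            combined[pronunciation] += weight * prob
    return dict(combined)


def best_pronunciation(model_outputs, origin_weights):
    """Return the candidate with the highest interpolated probability."""
    combined = interpolate_posteriors(model_outputs, origin_weights)
    return max(combined, key=combined.get)


# Toy, made-up posteriors from two hypothetical origin-specific G2P models.
outputs = {
    "origin_A": {"P1": 0.7, "P2": 0.3},
    "origin_B": {"P1": 0.4, "P2": 0.6},
}
weights = {"origin_A": 0.8, "origin_B": 0.2}

combined = interpolate_posteriors(outputs, weights)
choice = best_pronunciation(outputs, weights)
```

Because the sketch only consumes posterior probabilities, any G2P model that can score candidate phoneme sequences for a word could supply `model_outputs`, which matches the abstract's claim that the approach is independent of the underlying G2P algorithm.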