Pronunciation extraction from phoneme sequences through cross-lingual word-to-phoneme alignment

Authors:
Felix Stahlberg;Tim Schlippe;Stephan Vogel;Tanja Schultz
Affiliations:
Cognitive Systems Lab., Karlsruhe Institute of Technology, Karlsruhe, Germany;Cognitive Systems Lab., Karlsruhe Institute of Technology, Karlsruhe, Germany;Qatar Computing Research Institute, Qatar Foundation, Doha, Qatar;Cognitive Systems Lab., Karlsruhe Institute of Technology, Karlsruhe, Germany
Venue:
SLSP'13 Proceedings of the First international conference on Statistical Language and Speech Processing
Year:
2013

Citing 5

A systematic comparison of various statistical alignment models
Computational Linguistics

The mathematics of statistical machine translation: parameter estimation
Computational Linguistics - Special issue on using large corpora: II

Multilingual Speech Processing
Multilingual Speech Processing

Improving nonparameteric Bayesian inference: experiments on unsupervised word segmentation with adaptor grammars
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Evaluation of Clusterings -- Metrics and Visual Support
ICDE '12 Proceedings of the 2012 IEEE 28th International Conference on Data Engineering

Cited 1

Automatic speech recognition for under-resourced languages: A survey
Speech Communication

Abstract

Using written translations in a source language, we cross-lingually segment phoneme sequences in a target language into word units with our new alignment model, Model 3P [17]. From this segmentation, we deduce phonetic transcriptions of target-language words, define the vocabulary in terms of word IDs, and extract a pronunciation dictionary. The approach is highly relevant for bootstrapping dictionaries from audio data for Automatic Speech Recognition and for bypassing the written form in Speech-to-Speech Translation, particularly for under-resourced languages and languages that are not written at all. An analysis of 14 translations in 9 languages, used to build a dictionary for English, shows that the resulting dictionary is of higher quality when the source and target vocabularies are of similar size, sentences are shorter, words are repeated more often, and the translations are formally equivalent.
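The extraction step described in the abstract can be illustrated with a minimal sketch. It is not Model 3P and not the authors' implementation: it assumes that some aligner has already assigned each target-language phoneme to a source-word index, and it simplifies the dictionary construction to picking the most frequent phoneme segment per source word. The function names, the data layout, and the use of the source word as a label for each word ID are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict
from itertools import groupby


def segment_by_alignment(phonemes, alignment):
    """Split a phoneme sequence into contiguous runs that share a source-word index.

    `alignment[j]` is the (assumed, externally produced) index of the source
    word that phoneme j is aligned to.
    """
    if len(phonemes) != len(alignment):
        raise ValueError("phonemes and alignment must have the same length")
    segments = []
    pos = 0
    for src_idx, run in groupby(alignment):
        length = len(list(run))
        segments.append((src_idx, tuple(phonemes[pos:pos + length])))
        pos += length
    return segments


def extract_dictionary(corpus):
    """Build {word_id: (source_word_label, phoneme_tuple)} from aligned sentence pairs.

    Simplifying assumption: each source word labels one target word unit, and
    its pronunciation is the phoneme segment most often aligned to it.
    """
    counts = defaultdict(Counter)
    for source_words, phonemes, alignment in corpus:
        for src_idx, segment in segment_by_alignment(phonemes, alignment):
            counts[source_words[src_idx]][segment] += 1

    dictionary = {}
    for word_id, source_word in enumerate(sorted(counts)):
        best_segment = counts[source_word].most_common(1)[0][0]
        dictionary[word_id] = (source_word, best_segment)
    return dictionary


# Toy data (invented for illustration): German source words, English phonemes.
example_corpus = [
    (["das", "haus"], ["DH", "AH", "HH", "AW", "S"], [0, 0, 1, 1, 1]),
    (["ein", "haus"], ["AH", "HH", "AW", "S"], [0, 1, 1, 1]),
]
example_dictionary = extract_dictionary(example_corpus)
```

Word IDs stand in for the written form here, which matches the abstract's setting of a target language without an orthography. A realistic system would also need to handle alignment errors and pronunciation variants rather than keeping a single most frequent segment.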