Pronunciation extraction from phoneme sequences through cross-lingual word-to-phoneme alignment

  • Authors:
  • Felix Stahlberg;Tim Schlippe;Stephan Vogel;Tanja Schultz

  • Affiliations:
  • Cognitive Systems Lab., Karlsruhe Institute of Technology, Karlsruhe, Germany;Cognitive Systems Lab., Karlsruhe Institute of Technology, Karlsruhe, Germany;Qatar Computing Research Institute, Qatar Foundation, Doha, Qatar;Cognitive Systems Lab., Karlsruhe Institute of Technology, Karlsruhe, Germany

  • Venue:
  • SLSP'13 Proceedings of the First International Conference on Statistical Language and Speech Processing
  • Year:
  • 2013

Abstract

With the help of written translations in a source language, we cross-lingually segment phoneme sequences in a target language into word units using our new alignment model, Model 3P [17]. From this segmentation, we deduce phonetic transcriptions of target-language words, introduce the vocabulary in terms of word IDs, and extract a pronunciation dictionary. Our approach is highly relevant for bootstrapping dictionaries from audio data for Automatic Speech Recognition and for bypassing the written form in Speech-to-Speech Translation, particularly for under-resourced languages and for languages that are not written at all. Analyzing 14 translations in 9 languages to build a dictionary for English shows that the quality of the resulting dictionary is better when the vocabulary sizes of the source and target languages are close, when sentences are shorter, when words are repeated more often, and when the translations are formally equivalent.
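
The extraction step described in the abstract can be illustrated with a minimal sketch. It does not implement Model 3P; it assumes a word-to-phoneme alignment is already available as a list giving, for each target phoneme, the index of the source word it is aligned to. The function names (`segment`, `extract_dictionary`), the `W<n>` word-ID format, the rule of keeping the most frequent phoneme segment per source word, and the toy German-to-English data are all hypothetical choices made for illustration, not the authors' procedure.

```python
from collections import Counter, defaultdict
from itertools import groupby


def segment(phonemes, alignment):
    """Split a phoneme sequence into contiguous runs aligned to the same source word.

    `alignment[i]` is the (assumed, precomputed) source-word index for `phonemes[i]`.
    Returns a list of (source_index, phoneme_tuple) pairs in sequence order.
    """
    if len(phonemes) != len(alignment):
        raise ValueError("phonemes and alignment must have the same length")
    segments = []
    pos = 0
    for src_idx, run in groupby(alignment):
        length = len(list(run))
        segments.append((src_idx, tuple(phonemes[pos:pos + length])))
        pos += length
    return segments


def extract_dictionary(corpus):
    """Build a toy pronunciation dictionary keyed by hypothetical word IDs.

    `corpus` is an iterable of (source_words, target_phonemes, alignment) triples.
    Assumption: one target word per source word, represented by the most
    frequent phoneme segment aligned to that source word.
    """
    counts = defaultdict(Counter)
    for source_words, phonemes, alignment in corpus:
        for src_idx, pron in segment(phonemes, alignment):
            counts[source_words[src_idx]][pron] += 1

    dictionary = {}
    for n, src_word in enumerate(sorted(counts)):
        best_pron, _ = counts[src_word].most_common(1)[0]
        dictionary[f"W{n}"] = (src_word, best_pron)
    return dictionary


# Toy data (invented): German source words, English target phonemes.
example_corpus = [
    (["das", "haus"], ["DH", "AH", "HH", "AW", "S"], [0, 0, 1, 1, 1]),
    (["das", "haus"], ["DH", "AH", "HH", "AW", "S"], [0, 0, 1, 1, 1]),
    (["ein", "haus"], ["AH", "HH", "AW", "S"], [0, 1, 1, 1]),
]

if __name__ == "__main__":
    for word_id, (src_word, pron) in extract_dictionary(example_corpus).items():
        print(word_id, src_word, " ".join(pron))
```

Keying the entries by an abstract word ID rather than by a spelling reflects the setting in the abstract, where the target language may have no written form; the source word is kept only as a label for inspection.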