Joint-sequence models for grapheme-to-phoneme conversion

  • Authors:
  • Maximilian Bisani; Hermann Ney

  • Affiliations:
  • Lehrstuhl für Informatik VI, RWTH Aachen University, Ahornstraße 55, D-52056 Aachen, Germany; Lehrstuhl für Informatik VI, RWTH Aachen University, Ahornstraße 55, D-52056 Aachen, Germany

  • Venue:
  • Speech Communication
  • Year:
  • 2008

Abstract

Grapheme-to-phoneme conversion is the task of finding the pronunciation of a word given its written form. It has important applications in text-to-speech and speech recognition. Joint-sequence models are a simple and theoretically stringent probabilistic framework that is applicable to this problem. This article provides a self-contained and detailed description of this method. We present a novel estimation algorithm and demonstrate high accuracy on a variety of databases. Moreover, we study the impact of the maximum approximation in training and transcription, the interaction of model size parameters, n-best list generation, confidence measures, and phoneme-to-grapheme conversion. Our software implementation of the method proposed in this work is available under an Open Source license.