Feature-based pronunciation modeling for speech recognition

  • Authors: Karen Livescu; James Glass
  • Affiliation: MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA (both authors)
  • Venue: HLT-NAACL-Short '04: Proceedings of HLT-NAACL 2004: Short Papers
  • Year: 2004

Abstract

We present an approach to pronunciation modeling in which the evolution of multiple linguistic feature streams is explicitly represented. This differs from phone-based models in that pronunciation variation is viewed as the result of feature asynchrony and changes in feature values, rather than phone substitutions, insertions, and deletions. We have implemented a flexible feature-based pronunciation model using dynamic Bayesian networks. In this paper, we describe our approach and report on a pilot experiment using phonetic transcriptions of utterances from the Switchboard corpus. The experimental results, as well as the model's qualitative behavior, suggest that this is a promising way of accounting for the types of pronunciation variation often seen in spontaneous speech.
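
To make the idea of feature asynchrony concrete, here is a minimal toy sketch. It is not the authors' dynamic Bayesian network, and it has no probabilities. The three features, their values, the phone table, and the function names `realize` and `matching_phone` are all hypothetical, chosen only for illustration. Each feature stream keeps its own index into the word's canonical phone sequence, and the indices may drift apart by a bounded amount. Changes in feature values, the other source of variation named in the abstract, are not modeled here.

```python
# Hypothetical toy feature table; NOT the feature set used in the paper.
PHONE_FEATURES = {
    "t":  {"tongue_tip": "closed", "voicing": "off", "nasality": "off"},
    "eh": {"tongue_tip": "open",   "voicing": "on",  "nasality": "off"},
    "n":  {"tongue_tip": "closed", "voicing": "on",  "nasality": "on"},
}


def realize(phones, stream_indices, max_async=1):
    """Build per-frame surface feature values.

    phones: canonical phone sequence for a word, e.g. ["t", "eh", "n"].
    stream_indices: maps each feature name to a list giving, for every
        frame, the position in `phones` that this stream has reached.
    max_async: largest allowed gap between any two streams' positions
        in the same frame (an assumed, simplified asynchrony constraint).
    """
    streams = list(stream_indices)
    n_frames = len(stream_indices[streams[0]])
    frames = []
    for t in range(n_frames):
        positions = [stream_indices[s][t] for s in streams]
        if max(positions) - min(positions) > max_async:
            raise ValueError(f"streams too far apart at frame {t}")
        # Each stream takes its value from the phone it currently points at.
        frames.append(
            {s: PHONE_FEATURES[phones[stream_indices[s][t]]][s] for s in streams}
        )
    return frames


def matching_phone(frame):
    """Return the canonical phone whose features equal `frame`, else None."""
    for phone, feats in PHONE_FEATURES.items():
        if feats == frame:
            return phone
    return None


# The nasality stream moves on to "n" one frame before the other streams.
frames = realize(
    ["t", "eh", "n"],
    {
        "tongue_tip": [0, 1, 1, 2],
        "voicing":    [0, 1, 1, 2],
        "nasality":   [0, 1, 2, 2],
    },
)
labels = [matching_phone(f) for f in frames]
```

In this sketch the third frame combines the vowel's open tongue tip and voicing with the nasal's raised nasality, so it matches no phone in the toy table. A phone-based model would need a substitution or an extra phone symbol for that frame, whereas here it falls out of one stream running ahead of the others.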