Probabilistic Pronunciation Variation Model Based on Bayesian Network for Conversational Speech Recognition

  • Authors:
  • Sakriani Sakti;Konstantin Markov;Satoshi Nakamura

  • Venue:
  • ISUC '08 Proceedings of the 2008 Second International Symposium on Universal Communication
  • Year:
  • 2008

Abstract

This paper reports on an ongoing study of modeling pronunciation variation for conversational speech recognition, in which the mapping from canonical pronunciations (baseforms) to the actually realized phonemes (surface forms) is modeled by a Bayesian network. The advantage of this graphical model framework is that the probabilistic relationships between baseforms, surface forms, and any additional knowledge sources can be learned in a unified manner. Thus, we can easily incorporate additional knowledge sources from different domains. In this preliminary study, we investigate the dependency of surface forms on the current, preceding, and succeeding baseform phonemes, the position of the current baseform phoneme in the word, and whether or not the preceding surface phoneme was deleted. The performance of the proposed method was evaluated using spontaneous telephone conversations from a portion of the Switchboard corpus. Experimental results show that this method provides a consistent improvement in word accuracy over the standard pronunciation dictionary.
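
To make the dependency structure described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the surface phoneme is a single node whose parents are the five factors listed above, and it estimates that node's conditional probability table by relative frequency from aligned baseform/surface pairs. The phoneme symbols, the toy alignments, the "-" deletion marker, and the function names `estimate_cpt` and `surface_prob` are all hypothetical, as is the fallback to the canonical phoneme for unseen contexts. The abstract does not specify the network topology or the training procedure.

```python
from collections import Counter, defaultdict

# Hypothetical aligned training examples (made up for illustration).
# Context = (prev_base, base, next_base, position_in_word, prev_surface_deleted)
# Surface "-" marks a deleted phoneme; "#" marks a word boundary.
ALIGNED = [
    (("n", "t", "er", "medial", False), "-"),
    (("n", "t", "er", "medial", False), "-"),
    (("n", "t", "er", "medial", False), "t"),
    (("s", "t", "aa", "medial", False), "t"),
    (("ae", "t", "#", "final", False), "t"),
    (("ae", "t", "#", "final", False), "-"),
]


def estimate_cpt(examples):
    """Relative-frequency estimate of P(surface | context)."""
    counts = defaultdict(Counter)
    for context, surface in examples:
        counts[context][surface] += 1
    cpt = {}
    for context, surface_counts in counts.items():
        total = sum(surface_counts.values())
        cpt[context] = {s: c / total for s, c in surface_counts.items()}
    return cpt


def surface_prob(cpt, context, surface):
    """Look up P(surface | context).

    Assumption: an unseen context falls back to the canonical phoneme
    with probability 1, i.e. the standard dictionary pronunciation.
    """
    dist = cpt.get(context)
    if dist is None:
        return 1.0 if surface == context[1] else 0.0
    return dist.get(surface, 0.0)


cpt = estimate_cpt(ALIGNED)
p_deleted = surface_prob(cpt, ("n", "t", "er", "medial", False), "-")
print(f"P(deleted | n-t-er, medial) = {p_deleted:.3f}")
```

In a recognizer, probabilities of this kind could be used to weight alternative pronunciations in the dictionary. How the paper integrates them is not stated in the abstract.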