Probabilistic Pronunciation Variation Model Based on Bayesian Network for Conversational Speech Recognition

Authors:
Sakriani Sakti;Konstantin Markov;Satoshi Nakamura
Venue:
ISUC '08 Proceedings of the 2008 Second International Symposium on Universal Communication
Year:
2008

Citing 0
Cited 1

Sequence-based pronunciation modeling using a noisy-channel approach

IWSDS'10 Proceedings of the Second international conference on Spoken dialogue systems for ambient environments

Abstract

This paper reports on an ongoing study of pronunciation variation modeling for conversational speech recognition, in which the mapping from canonical pronunciations (baseforms) to the actually realized phonemes (surface forms) is modeled by a Bayesian network. The advantage of this graphical model framework is that the probabilistic relationships between baseforms, surface forms, and any additional knowledge sources can be learned in a unified manner, so knowledge sources from different domains can be incorporated easily. In this preliminary study, we investigate the dependency of surface forms on the current, preceding, and succeeding baseform phonemes, the position of the current baseform phoneme in the word, and whether or not the preceding surface phoneme was deleted. The performance of the proposed method was evaluated using spontaneous telephone conversations from a portion of the Switchboard corpus. Experimental results show that this method provides a consistent improvement in word accuracy over the standard pronunciation dictionary.
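
To make the dependencies named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes that baseform and surface phoneme sequences are already aligned one-to-one, that a deleted surface phoneme is marked with a hypothetical "-" symbol, and that word position is coarsened to initial/medial/final. It estimates a single conditional probability table P(surface | current, preceding, succeeding baseform, position, preceding-deleted) by relative frequency, without smoothing. All function names, the boundary symbols, and the toy data are illustrative assumptions.

```python
from collections import Counter, defaultdict

DELETED = "-"  # assumed marker for a deleted surface phoneme (not from the paper)


def context_features(baseform, surface, i):
    """Return the parent variables of surface phoneme i named in the abstract."""
    prev_b = baseform[i - 1] if i > 0 else "<s>"                   # preceding baseform phoneme
    next_b = baseform[i + 1] if i < len(baseform) - 1 else "</s>"  # succeeding baseform phoneme
    if i == 0:
        position = "initial"
    elif i == len(baseform) - 1:
        position = "final"
    else:
        position = "medial"
    prev_deleted = i > 0 and surface[i - 1] == DELETED             # was the previous surface phoneme deleted?
    return (baseform[i], prev_b, next_b, position, prev_deleted)


def train_cpt(aligned_words):
    """Count surface phonemes per context over (baseform, surface) pairs."""
    counts = defaultdict(Counter)
    for baseform, surface in aligned_words:
        if len(baseform) != len(surface):
            raise ValueError("sequences must be aligned one-to-one")
        for i in range(len(baseform)):
            counts[context_features(baseform, surface, i)][surface[i]] += 1
    return counts


def surface_prob(counts, context, phone):
    """Relative-frequency estimate of P(phone | context); 0.0 for unseen contexts."""
    seen = counts.get(context, Counter())
    total = sum(seen.values())
    return seen[phone] / total if total else 0.0


# Toy aligned data for the word "and" (invented for illustration only).
aligned = [
    (["ae", "n", "d"], ["ae", "n", "-"]),
    (["ae", "n", "d"], ["ae", "n", "d"]),
    (["ae", "n", "d"], ["ae", "-", "-"]),
]
cpt = train_cpt(aligned)

# Probability that word-final /d/ after /n/ is deleted, given whether /n/ itself was deleted.
p_del_after_kept = surface_prob(cpt, ("d", "n", "</s>", "final", False), DELETED)
p_del_after_deleted = surface_prob(cpt, ("d", "n", "</s>", "final", True), DELETED)
```

A full Bayesian network would typically factor or tie these dependencies rather than store one joint table, which is what lets additional knowledge sources be added without the context space growing unmanageably; the single table here only illustrates which variables condition the surface form.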