Feature-based pronunciation modeling for speech recognition

Authors:
Karen Livescu; James Glass
Affiliations:
MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA; MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA
Venue:
HLT-NAACL-Short '04 Proceedings of HLT-NAACL 2004: Short Papers
Year:
2004

Abstract

We present an approach to pronunciation modeling in which the evolution of multiple linguistic feature streams is explicitly represented. This differs from phone-based models in that pronunciation variation is viewed as the result of feature asynchrony and changes in feature values, rather than phone substitutions, insertions, and deletions. We have implemented a flexible feature-based pronunciation model using dynamic Bayesian networks. In this paper, we describe our approach and report on a pilot experiment using phonetic transcriptions of utterances from the Switchboard corpus. The experimental results, as well as the model's qualitative behavior, suggest that this is a promising way of accounting for the types of pronunciation variation often seen in spontaneous speech.
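The idea of feature asynchrony can be made concrete with a small sketch. The Python below is an illustrative toy, not the paper's dynamic Bayesian network: the three streams, the feature values in `TARGETS`, the two-phone baseform, and the `max_async` bound are all assumptions chosen for this example. It only enumerates which combinations of per-stream positions are permitted and shows the feature configuration that results when one stream lags behind the others.

```python
from itertools import product

# Hypothetical feature streams, chosen for illustration only.
STREAMS = ("lips", "velum", "voicing")

# Toy feature targets per phone (invented values, not the paper's feature set).
TARGETS = {
    "m":  {"lips": "closed", "velum": "open",   "voicing": "voiced"},
    "th": {"lips": "open",   "velum": "closed", "voicing": "voiceless"},
}

def allowed_states(baseform, max_async=1):
    """Yield index tuples (one baseform position per stream) whose
    spread between the most and least advanced stream is <= max_async."""
    positions = range(len(baseform))
    for idx in product(positions, repeat=len(STREAMS)):
        if max(idx) - min(idx) <= max_async:
            yield idx

def surface_config(baseform, idx):
    """Feature values realized when each stream sits at its own position."""
    return {s: TARGETS[baseform[i]][s] for s, i in zip(STREAMS, idx)}

baseform = ["m", "th"]
states = list(allowed_states(baseform, max_async=1))

# Lips are still at "m" while velum and voicing have moved on to "th".
lagging_lips = surface_config(baseform, (0, 1, 1))
print(lagging_lips)
```

In this toy setting the lagging-lips state combines closed lips with a closed velum and no voicing, a configuration that belongs to neither baseform phone. A phone-based model would have to describe the same effect as an inserted segment, whereas here it falls out of the streams being briefly out of step. Setting `max_async=0` forces all streams to move together and leaves only the two canonical configurations.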