Subphonetic modeling with Markov states: senone

Authors:
Mei-Yuh Hwang; Xuedong Huang
Affiliations:
School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania; School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania
Venue:
ICASSP'92 Proceedings of the 1992 IEEE international conference on Acoustics, speech and signal processing - Volume 1
Year:
1992

Abstract

We will never have sufficient training data to model all the various acoustic-phonetic phenomena. How to capture important clues and reliably estimate the needed parameters is one of the central issues in speech recognition. Successful examples include subword models, fenones, and many other smoothing techniques. Compared with subword models, subphonetic modeling may provide a finer level of detail. We propose to model subphonetic events with Markov states and to treat the state in phonetic hidden Markov models as our basic subphonetic unit, the senone. Senones generalize fenones in several ways. A word model is a concatenation of senones, and senones can be shared across different word models. Senone models not only allow parameter sharing but also enable pronunciation optimization. In this paper, we report preliminary senone modeling results, which significantly reduce the word error rate for speaker-independent continuous speech recognition.
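The idea that a word model is a concatenation of senones, with senones shared across word models, can be illustrated with a small data-structure sketch. This is not the authors' implementation: the discrete output distributions, the four-symbol codebook, the senone inventory, and the names `Senone`, `SENONES`, `WORD_MODELS`, `distinct_senones`, and `output_parameter_counts` are all assumptions made for illustration. It only shows how sharing reduces the number of output parameters that must be estimated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Senone:
    """One shared Markov-state output distribution (assumed discrete here)."""
    senone_id: int
    output_probs: tuple  # P(codeword | senone), one entry per codebook symbol


# Hypothetical inventory: three senones over a 4-symbol codebook.
SENONES = {
    0: Senone(0, (0.7, 0.1, 0.1, 0.1)),
    1: Senone(1, (0.1, 0.7, 0.1, 0.1)),
    2: Senone(2, (0.1, 0.1, 0.1, 0.7)),
}

# Each word model is a concatenation of senone ids.
# Senones 0 and 1 are shared by both (hypothetical) words.
WORD_MODELS = {
    "word_a": [0, 1, 2],
    "word_b": [0, 1, 1],
}


def distinct_senones(word_models):
    """Return the set of senone ids actually used by the word models."""
    return {sid for seq in word_models.values() for sid in seq}


def output_parameter_counts(word_models, senones):
    """Compare output-parameter counts without and with senone sharing.

    Without sharing, every state of every word model owns its own
    distribution; with sharing, only distinct senones carry parameters.
    """
    codebook_size = len(next(iter(senones.values())).output_probs)
    untied = sum(len(seq) for seq in word_models.values()) * codebook_size
    tied = len(distinct_senones(word_models)) * codebook_size
    return untied, tied


if __name__ == "__main__":
    untied, tied = output_parameter_counts(WORD_MODELS, SENONES)
    print(f"untied output parameters: {untied}, tied: {tied}")
```

In this toy setup the two word models have six states in total but use only three distinct senones, so the shared inventory needs half as many output parameters as six independent state distributions would.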