We apply the PDF projection theorem to generalize the hidden Markov model (HMM) to accommodate multiple simultaneous segmentations of the raw data and multiple feature extraction transformations. Different segment sizes and feature transformations are assigned to each state. The algorithm averages over all allowable segmentations by mapping them to a "proxy" HMM and applying the forward procedure. A by-product of the algorithm is a set of a posteriori state probability estimates that serve as a description of the input data. These probabilities simultaneously have the temporal resolution of the smallest processing windows and the processing gain and frequency resolution of the largest processing windows. The method is demonstrated on the problem of precisely modeling the consonant "T" in order to detect the presence of a distinct "burst" component. We compare the algorithm against standard speech analysis methods using data from the TIMIT corpus.
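For readers unfamiliar with the forward procedure mentioned above, the following is a minimal sketch of it on an ordinary discrete-time HMM, with a backward pass added to obtain a posteriori state probabilities. It is not the multiresolution algorithm or the proxy-HMM construction of the paper: the function name `forward_backward`, the two-state example, and all numeric values are hypothetical and chosen only for illustration. The per-step observation likelihoods `B[t][i]` are assumed to be precomputed.

```python
import math

def forward_backward(pi, A, B):
    """Scaled forward-backward pass for a standard HMM (illustrative only).

    pi[i]   : initial probability of state i
    A[i][j] : probability of moving from state i to state j
    B[t][i] : likelihood of the observation at time t under state i
    Returns (log-likelihood of the sequence, posteriors gamma[t][i]).
    """
    T, N = len(B), len(pi)
    alpha, scales = [], []
    for t in range(T):
        if t == 0:
            a = [pi[i] * B[0][i] for i in range(N)]
        else:
            prev = alpha[-1]
            a = [sum(prev[i] * A[i][j] for i in range(N)) * B[t][j]
                 for j in range(N)]
        c = sum(a)  # scaling factor, keeps the recursion numerically stable
        if c == 0.0:
            raise ValueError("observation sequence has zero likelihood")
        alpha.append([x / c for x in a])
        scales.append(c)

    # The total likelihood is the product of the scaling factors.
    log_lik = sum(math.log(c) for c in scales)

    # Backward pass, scaled with the same factors as the forward pass.
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [
            sum(A[i][j] * B[t + 1][j] * beta[t + 1][j] for j in range(N))
            / scales[t + 1]
            for i in range(N)
        ]

    # A posteriori state probabilities P(state at t = i | all observations).
    gamma = []
    for t in range(T):
        g = [alpha[t][i] * beta[t][i] for i in range(N)]
        s = sum(g)
        gamma.append([x / s for x in g])
    return log_lik, gamma

# Hypothetical two-state example with three observations.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.1], [0.4, 0.3], [0.1, 0.6]]
log_lik, gamma = forward_backward(pi, A, B)
```

Each row of `gamma` is a probability distribution over states at one time step, which is the kind of per-time-step description of the input that the abstract refers to, here at a single resolution only.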