Recent theoretical developments in neuroscience suggest that sublexical speech processing occurs via two parallel processing pathways. According to this Dual Stream Model of Speech Processing, speech is processed both as sequences of speech sounds and as sequences of articulations. We attempt to revise the "beads-on-a-string" paradigm of Hidden Markov Models (HMMs) in Automatic Speech Recognition (ASR) by implementing a system for dual stream speech recognition. A baseline recognition system is enhanced by modeling articulations as sequences of syllables. An efficient model that complements HMMs is developed by formulating Dynamic Time Warping (DTW) as a probabilistic model. The DTW Model (DTWM) is improved by enriching syllable templates with constrained covariance matrices, data imputation, clustering, and mixture modeling. The resulting dual stream system is evaluated on the N-Best Southern Dutch Broadcast News benchmark. Promising results are obtained in DTWM classification and ASR tests. We discuss the remaining problems in implementing dual stream speech recognition.
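To illustrate what formulating DTW as a probabilistic model can look like, the following is a minimal sketch and not the DTWM described above. It assumes each template frame carries a mean vector and a diagonal variance vector, uses the Gaussian negative log-likelihood as the local alignment cost, and applies the standard symmetric DTW step pattern. The names `frame_nll`, `dtw_nll`, and `classify` are hypothetical, and refinements named in the abstract (constrained covariances, imputation, clustering, mixtures) are omitted.

```python
import math


def frame_nll(x, mean, var):
    """Negative log-likelihood of frame x under a diagonal Gaussian (assumed form)."""
    return 0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def dtw_nll(frames, template_means, template_vars):
    """Accumulated negative log-likelihood along the best DTW alignment path."""
    n, m = len(frames), len(template_means)
    inf = float("inf")
    # D[i][j]: best accumulated cost aligning the first i input frames
    # with the first j template frames.
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = frame_nll(frames[i - 1], template_means[j - 1], template_vars[j - 1])
            # Symmetric step pattern: vertical, horizontal, or diagonal move.
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


def classify(frames, templates):
    """Return the label whose template gives the lowest alignment cost.

    templates maps a label to a (means, variances) pair, one entry per template frame.
    """
    return min(templates, key=lambda label: dtw_nll(frames, *templates[label]))
```

In this sketch a syllable would be classified by calling `classify(frames, templates)` with one template per syllable label. The accumulated cost is not length-normalized, so templates of very different lengths are not directly comparable without further normalization.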