Trajectory Clustering for Solving the Trajectory Folding Problem in Automatic Speech Recognition

Authors:
Y. Han; Johan de Veth; L. Boves
Affiliations:
Center for Language & Speech Technol., Radboud Univ. Nijmegen; -; -
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2007

Abstract

In this paper, we introduce a novel method for clustering speech gestures, represented as continuous trajectories in acoustic parameter space. Trajectory clustering allows us to avoid the conditional independence assumption, which makes it difficult to account for the fact that successive measurements of an articulatory gesture are correlated. We apply the trajectory clustering method to develop multiple parallel hidden Markov models (HMMs) for a continuous digits recognition task. We compare the performance obtained with data-driven clustering to the recognition performance obtained with conventional head-body-tail models, which use knowledge-based criteria for building multiple HMMs in order to obviate the trajectory folding problem. The results show that trajectory clustering is able to discover structure in the training database that is different from the structure assumed by the knowledge-based approach. In addition, the data-derived structure gives rise to significantly better recognition performance, and results in a 10% word error rate reduction.
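
To make the idea of clustering whole trajectories concrete, here is a minimal Python sketch. It is not the algorithm from the paper, which the abstract does not specify. It assumes, purely for illustration, that each trajectory is resampled to a fixed number of frames and that plain k-means is run on the flattened result, so that every trajectory is treated as a single unit rather than as a set of independent frames. The names `resample` and `cluster_trajectories`, and the parameters `n_frames` and `k`, are hypothetical.

```python
import numpy as np


def resample(traj, n_frames=10):
    """Linearly interpolate a (T, D) trajectory onto n_frames time points."""
    traj = np.asarray(traj, dtype=float)
    src = np.linspace(0.0, 1.0, len(traj))
    dst = np.linspace(0.0, 1.0, n_frames)
    cols = [np.interp(dst, src, traj[:, d]) for d in range(traj.shape[1])]
    return np.stack(cols, axis=1)


def cluster_trajectories(trajs, k=2, n_frames=10, n_iter=50, seed=0):
    """Illustrative k-means over whole trajectories (an assumption, not the paper's method)."""
    # Each row of X is one complete trajectory, so frame-to-frame
    # correlations stay inside a single data point.
    X = np.stack([resample(t, n_frames).ravel() for t in trajs])
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.stack([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers.reshape(k, n_frames, -1)


# Toy data: rising and falling one-dimensional trajectories of varying length.
rising = [np.linspace(0.0, 1.0, T)[:, None] for T in (5, 8, 12)]
falling = [np.linspace(1.0, 0.0, T)[:, None] for T in (6, 9, 11)]
labels, centers = cluster_trajectories(rising + falling, k=2)
```

In a setup like the one the abstract describes, the tokens assigned to each cluster could then serve as training data for one of the parallel HMMs of a word, so that distinct trajectory shapes are not folded into a single model.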