Evaluation of segmental unit input HMM

Authors:
S. Nakagawa; K. Yamamoto
Affiliations:
Dept. of Inf. & Comput. Sci., Toyohashi Univ. of Technol., Japan; Sch. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA
Venue:
ICASSP '96: Proceedings of the 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing, Volume 1
Year:
1996

Cited by:
Japanese spoken document retrieval considering OOV keywords using LVCSR system with OOV detection processing (HLT '02 Proceedings of the second international conference on Human Language Technology Research)
Linear Discriminant Analysis Using a Generalized Mean of Class Covariances and Its Application to Speech Recognition (IEICE - Transactions on Information and Systems)
An empirical study on multiple LVCSR model combination by machine learning (HLT-NAACL-Short '04 Proceedings of HLT-NAACL 2004: Short Papers)

Abstract

The standard HMM cannot fully express time-varying features while it remains in the same state. Various methods have been studied so that the dynamic changes in speech characteristics are not ignored. In this paper, we compare and evaluate a segmental unit input HMM, in which several successive frames are combined into a single input vector, against a conditional density HMM and against the use of regression coefficients. Using segmental statistics increases the dimension of the parameters, which lowers the precision with which the covariance matrix can be estimated. We therefore compressed the dimension and reduced the computation by means of the K-L (Karhunen-Loève) expansion and the modified quadratic discriminant function (MQDF). With segmental unit input to the basic HMM structure, we obtained a better recognition rate than with traditional methods, and the combination of a segmental unit of successive mel-cepstrum frames with regression coefficients gave the best recognition rate.
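
The segmental unit input and the K-L compression described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the segment width of 4 frames, the 10-dimensional random "mel-cepstrum" frames, and the 16 retained dimensions are all assumptions chosen for illustration, and the MQDF step is not shown.

```python
import numpy as np

def stack_frames(features, width):
    """Concatenate `width` successive frames into one segmental input vector.

    features: array of shape (T, D), one D-dimensional frame per row.
    Returns an array of shape (T - width + 1, width * D).
    """
    num_frames = features.shape[0]
    if width < 1 or width > num_frames:
        raise ValueError("width must be between 1 and the number of frames")
    num_segments = num_frames - width + 1
    # Segment t is [frame t, frame t+1, ..., frame t+width-1] laid end to end.
    return np.hstack([features[i:i + num_segments] for i in range(width)])

def kl_basis(vectors, num_components):
    """Estimate a K-L (principal component) basis from training vectors.

    Returns the sample mean and the eigenvectors of the sample covariance
    matrix that correspond to the largest eigenvalues.
    """
    mean = vectors.mean(axis=0)
    cov = np.cov(vectors - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:num_components]
    return mean, eigvecs[:, order]

def kl_project(vectors, mean, basis):
    """Project vectors onto the retained K-L axes."""
    return (vectors - mean) @ basis

# Hypothetical data: 50 frames of 10-dimensional features (random numbers
# standing in for mel-cepstrum coefficients).
rng = np.random.default_rng(0)
frames = rng.standard_normal((50, 10))

segments = stack_frames(frames, width=4)        # 4 frames -> 40 dimensions
mean, basis = kl_basis(segments, num_components=16)
compressed = kl_project(segments, mean, basis)  # 40 -> 16 dimensions
```

Stacking four 10-dimensional frames gives 40-dimensional vectors, so a full covariance matrix has 40 x 40 entries to estimate from the same amount of data; projecting onto a few leading K-L axes is one way to keep that estimation problem manageable.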