Connected digit recognition using short and long duration models

Authors:
C. Chesta;P. Laface;F. Ravera
Affiliations:
Dipt. di Autom. e Inf., Politecnico di Torino, Italy;-;-
Venue:
ICASSP '99 Proceedings of the Acoustics, Speech, and Signal Processing, 1999. on 1999 IEEE International Conference - Volume 02
Year:
1999

Citing 0
Cited 1

Automatic speech recognition and speech variability: A review

Speech Communication

Quantified Score

Hi-index	0.00

Visualization

Abstract

We show that accurate HMMs for connected word recognition can be obtained without context dependent modeling and discriminative training. We train two HMMs for each word that have the same, standard, left to right topology with the possibility of skipping once state, but each model has a different number of states, automatically selected. The two models account for different speaking rates that occur not only in different utterances of the speakers, but also within a connected word utterance of the same speaker. This simple modeling technique has been applied to connected digit recognition using the adult speaker portion of the TI/NIST corpus giving the best results reported so far for this database. It has also been tested on telephone speech using long sequences of Italian digits (credit card numbers), giving better results with respect to classical models with a larger number of densities.