In this paper we compare two methods for automatic phonetic labeling of a continuous speech database, as is usually required when designing a speech recognition or speech synthesis system. The first method is based on temporal alignment of the speech with a synthetic speech pattern; the second uses either continuous-density hidden Markov models (HMMs) or a hybrid HMM/ANN (artificial neural network) system in forced-alignment mode. Both systems were evaluated on read utterances that were not part of the training set of the HMM systems, and compared with manual segmentation. This study outlines the advantages and drawbacks of both methods. The synthetic-speech system has the great advantage that no training stage (and hence no large labeled database) is needed, while HMM systems easily handle multiple phonetic transcriptions (a phonetic lattice). From this we derive a method for the automatic creation of large phonetically labeled speech databases, in which the synthetic-speech segmentation tool is used to bootstrap the training of either an HMM or a hybrid HMM/ANN system. Such segmentation tools are key to the development of improved multilingual speech synthesis and recognition systems.
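The abstract does not say how the temporal alignment with the synthetic pattern is computed. As a rough illustration only, the sketch below assumes plain dynamic time warping (DTW) over one scalar feature per frame and transfers known phone boundaries from the synthetic (reference) utterance to the natural one. The names `dtw_path` and `transfer_boundaries`, the absolute-difference frame distance, and the toy data are all hypothetical and are not taken from the paper.

```python
from math import inf


def dtw_path(ref, tgt):
    """Return one optimal DTW path as a list of (ref_frame, tgt_frame) pairs.

    Assumption: each frame is a single float and the local distance is the
    absolute difference. A real system would use spectral feature vectors.
    """
    n, m = len(ref), len(tgt)
    # cost[i][j] = best accumulated cost aligning ref[:i] with tgt[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(ref[i - 1] - tgt[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end to the start, always taking the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path


def transfer_boundaries(path, ref_boundaries):
    """Map each boundary frame in the reference to the first aligned target frame."""
    first_match = {}
    for i, j in path:
        first_match.setdefault(i, j)
    return [first_match[b] for b in ref_boundaries]


# Toy data: the synthetic pattern changes phone at frame 2,
# and the natural utterance is a slower rendition of the same sequence.
synthetic = [0.0, 0.0, 1.0, 1.0]
natural = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
print(transfer_boundaries(dtw_path(synthetic, natural), [2]))
```

Because the boundaries of the synthetic utterance are known by construction, this kind of alignment needs no trained acoustic model, which is the property the abstract highlights for the synthesis-based method.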