In this paper, we propose a neural network model for predicting the durations of syllables. A four-layer feedforward neural network trained with the backpropagation algorithm is used to model the duration knowledge of syllables. Broadcast news data in three Indian languages (Hindi, Telugu and Tamil) are used for this study. The input to the neural network consists of a set of features extracted from the text, corresponding to phonological, positional and contextual information. The relative importance of the positional and contextual features is examined separately. To improve the accuracy of prediction, the predicted duration values are processed further, and we also propose a two-stage duration model. From these studies we find that 85% of the syllable durations could be predicted by the models within 25% of the actual duration. The performance of the duration models is evaluated using objective measures: average prediction error (μ), standard deviation (σ) and correlation coefficient (γ).
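The objective measures named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so the definitions here are assumptions: the average prediction error is taken as the mean absolute difference between actual and predicted durations, the standard deviation is computed over those absolute errors, and the correlation coefficient is the Pearson correlation. The function name `duration_metrics`, its `tolerance` parameter, and the dictionary keys are hypothetical and chosen only for illustration.

```python
import math


def duration_metrics(actual, predicted, tolerance=0.25):
    """Compare predicted syllable durations with actual ones.

    Assumed definitions (not taken from the paper):
      avg_error        : mean of |actual - predicted|
      std_dev          : population standard deviation of those absolute errors
      corr             : Pearson correlation between actual and predicted
      within_tolerance : fraction of syllables whose absolute error is at most
                         `tolerance` times the actual duration
    """
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")

    n = len(actual)
    errors = [abs(a - p) for a, p in zip(actual, predicted)]

    avg_error = sum(errors) / n
    std_dev = math.sqrt(sum((e - avg_error) ** 2 for e in errors) / n)

    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted)) / n
    std_a = math.sqrt(sum((a - mean_a) ** 2 for a in actual) / n)
    std_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted) / n)
    # The correlation is undefined when either sequence is constant.
    corr = cov / (std_a * std_p) if std_a > 0 and std_p > 0 else float("nan")

    within = sum(1 for a, e in zip(actual, errors) if e <= tolerance * a) / n

    return {
        "avg_error": avg_error,
        "std_dev": std_dev,
        "corr": corr,
        "within_tolerance": within,
    }
```

Called with two equal-length lists of durations in the same unit (for example milliseconds), the function returns all four values at once. With the default `tolerance=0.25`, the `within_tolerance` entry is the kind of figure the abstract reports as the share of syllables predicted within 25% of the actual duration.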