In this paper, we demonstrate the use of prosody models in developing speech systems for Indian languages. The prosody models are duration and intonation models built with feedforward neural networks. Labelled broadcast news data in Hindi, Telugu, Tamil and Kannada are used to train the networks to predict duration and intonation. The input features represent positional, contextual and phonological constraints. The use of the prosody models is illustrated in four applications: speech recognition, speech synthesis, speaker recognition and language identification. Autoassociative neural networks and support vector machines serve as the classification models in these speech systems. The performance of the speech systems is shown to improve when the prosodic features are combined with a popular spectral feature set, Weighted Linear Prediction Cepstral Coefficients (WLPCCs).
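As a rough illustration of the kind of duration model described above, the following is a minimal sketch of a one-hidden-layer feedforward network trained to map a feature vector to a duration. Everything specific in it is an assumption made for illustration and is not taken from the paper: the four input features, the hidden layer size, the learning rate, the synthetic training targets, and the function names `init_params`, `forward` and `train`. The abstract does not specify the actual architecture, feature encoding or training procedure.

```python
import numpy as np

def init_params(n_in, n_hidden, seed=0):
    """Small random weights for a one-hidden-layer network (sizes are assumptions)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_hidden, 1)),
        "b2": np.zeros(1),
    }

def forward(params, X):
    """Return hidden activations and predicted durations for a batch X."""
    H = np.tanh(X @ params["W1"] + params["b1"])
    y = H @ params["W2"] + params["b2"]
    return H, y[:, 0]

def train(params, X, y, lr=0.05, epochs=500):
    """Full-batch gradient descent on half the mean squared error."""
    n = X.shape[0]
    for _ in range(epochs):
        H, pred = forward(params, X)
        err = (pred - y)[:, None]                       # shape (n, 1)
        # Compute all gradients before updating any parameter.
        gW2 = H.T @ err / n
        gb2 = err.mean(axis=0)
        dH = (err @ params["W2"].T) * (1.0 - H ** 2)    # backprop through tanh
        gW1 = X.T @ dH / n
        gb1 = dH.mean(axis=0)
        params["W1"] -= lr * gW1
        params["b1"] -= lr * gb1
        params["W2"] -= lr * gW2
        params["b2"] -= lr * gb2
    return params

# Synthetic stand-in data: 4 hypothetical features scaled to [0, 1]
# (e.g. positional and contextual codes), with made-up durations in seconds.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, (200, 4))
y = 0.12 + 0.08 * X[:, 0] + 0.05 * X[:, 3]

params = init_params(n_in=4, n_hidden=8)
mse_before = float(np.mean((forward(params, X)[1] - y) ** 2))
train(params, X, y)
mse_after = float(np.mean((forward(params, X)[1] - y) ** 2))
print(mse_before, mse_after)
```

An intonation model could follow the same pattern with a pitch value as the regression target, though that too is an assumption about a system the abstract only summarizes.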