Aging speech recognition with speaker adaptation techniques: Study on medium vocabulary continuous Bengali speech

Authors:
Biswajit Das;Sandipan Mandal;Pabitra Mitra;Anupam Basu
Affiliations:
Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur 721302, West Bengal, India;Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur 721302, West Bengal, India;Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur 721302, West Bengal, India;Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur 721302, West Bengal, India
Venue:
Pattern Recognition Letters
Year:
2013

Abstract

This article describes the development of a Bengali speech recognition system for the aging population using various adaptation techniques. Variability in acoustic characteristics among speakers degrades speech recognition accuracy. In general, perceptual as well as acoustic variations exist among speakers, but the variations are more pronounced between the young and the aged population. Deviations in voice source features between the two age groups affect speech recognition performance. Existing automatic speech recognition algorithms demand a large amount of training data covering all of this variability to build a robust system. Speaker normalization and adaptation techniques, however, attempt to reduce inter-speaker or intra-speaker acoustic variability without requiring a large amount of training data. In the current study, conventional acoustic model adaptation methods, e.g. vocal tract length normalization, maximum likelihood linear regression and/or maximum a posteriori estimation, are combined to improve recognition accuracy. In addition, the maximum mutual information estimation technique has been implemented in this study.
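
As an illustration of one of the adaptation methods named in the abstract, the following is a minimal sketch of maximum a posteriori (MAP) adaptation of a single Gaussian mean. It assumes the common relevance-factor form of the update, in which the adapted mean interpolates between the prior (speaker-independent) mean and the occupancy-weighted adaptation data. The function name `map_adapt_mean`, the parameter `tau`, and the toy values are hypothetical and are not taken from the study, which does not give its implementation details here.

```python
def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """Return the MAP-adapted mean of one Gaussian component.

    prior_mean : list of floats, the speaker-independent mean (assumed given)
    frames     : list of feature vectors from the adaptation speaker
    posteriors : per-frame occupation probabilities for this Gaussian
    tau        : relevance factor (hypothetical default); larger values keep
                 the result closer to the prior mean
    """
    if len(frames) != len(posteriors):
        raise ValueError("frames and posteriors must have the same length")
    if tau <= 0:
        raise ValueError("tau must be positive")

    dim = len(prior_mean)
    occupancy = sum(posteriors)      # soft count of frames for this Gaussian
    weighted_sum = [0.0] * dim       # posterior-weighted sum of the frames
    for frame, gamma in zip(frames, posteriors):
        for d in range(dim):
            weighted_sum[d] += gamma * frame[d]

    # Interpolate between the prior mean and the adaptation data.
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]


# Toy usage: two adaptation frames, both fully assigned to this Gaussian.
adapted = map_adapt_mean([0.0, 0.0], [[2.0, 4.0], [2.0, 4.0]], [1.0, 1.0], tau=2.0)
print(adapted)
```

With little adaptation data the occupancy is small and the result stays near the prior mean; as more data from the target speaker accumulates, the estimate moves toward the data mean. This is why MAP-style adaptation suits settings where only a limited amount of speech from the target group is available.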