Aging speech recognition with speaker adaptation techniques: Study on medium vocabulary continuous Bengali speech

  • Authors:
  • Biswajit Das;Sandipan Mandal;Pabitra Mitra;Anupam Basu

  • Affiliations:
  • Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur 721302, West Bengal, India;Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur 721302, West Bengal, India;Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur 721302, West Bengal, India;Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur 721302, West Bengal, India

  • Venue:
  • Pattern Recognition Letters
  • Year:
  • 2013

Abstract

This article describes the development of a Bengali speech recognition system for the aging population using various speaker adaptation techniques. Variability in acoustic characteristics among speakers degrades speech recognition accuracy. Perceptual and acoustic variations exist among speakers in general, but they are more pronounced between young and aged populations. Deviations in voice source features between the two age groups affect speech recognition performance. Existing automatic speech recognition algorithms demand a large amount of training data covering all of this variability to build a robust system. Speaker normalization and adaptation techniques, however, attempt to reduce inter-speaker or intra-speaker acoustic variability without requiring a large amount of training data. In the current study, conventional acoustic model adaptation methods, e.g., vocal tract length normalization (VTLN), maximum likelihood linear regression (MLLR), and/or maximum a posteriori (MAP) adaptation, are combined to improve recognition accuracy. In addition, the maximum mutual information estimation (MMIE) technique has been implemented.
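
As an illustration of one technique named in the abstract, the following is a minimal sketch of the textbook maximum a posteriori update for a single Gaussian mean. It is not taken from the article: the function name `map_adapt_mean`, the prior weight `tau`, and the toy values are assumptions chosen for illustration, and the authors' actual system and settings are not described on this page.

```python
def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """Textbook MAP re-estimate of one Gaussian mean (illustrative sketch).

    prior_mean: speaker-independent mean, a list of floats
    frames:     adaptation feature vectors, each a list of floats
    posteriors: occupation probability of this Gaussian for each frame
    tau:        prior weight (hypothetical default, not from the article)
    """
    if len(frames) != len(posteriors):
        raise ValueError("frames and posteriors must have the same length")

    dim = len(prior_mean)
    occupancy = sum(posteriors)          # total soft count for this Gaussian
    weighted_sum = [0.0] * dim           # posterior-weighted sum of frames
    for frame, gamma in zip(frames, posteriors):
        for d in range(dim):
            weighted_sum[d] += gamma * frame[d]

    # Interpolate between the prior mean and the adaptation data:
    # little data keeps the mean near the prior, more data moves it
    # toward the speaker's own statistics.
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]


# Toy usage: two adaptation frames, each fully assigned to this Gaussian.
adapted = map_adapt_mean([0.0, 0.0], [[2.0, 4.0], [2.0, 4.0]], [1.0, 1.0], tau=2.0)
```

With no adaptation frames the function returns the prior mean unchanged, which reflects why this kind of adaptation is attractive when little data from aged speakers is available.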