Ageing voices: the effect of changes in voice parameters on ASR performance

Authors:
Ravichander Vipperla;Steve Renals;Joe Frankel
Affiliations:
The Center for Speech Technology Research, School of Informatics, University of Edinburgh, Edinburgh, UK;The Center for Speech Technology Research, School of Informatics, University of Edinburgh, Edinburgh, UK;The Center for Speech Technology Research, School of Informatics, University of Edinburgh, Edinburgh, UK
Venue:
EURASIP Journal on Audio, Speech, and Music Processing - Special issue on atypical speech
Year:
2010

Abstract

With ageing, human voices undergo several changes, typically characterized by increased hoarseness and changes in articulation patterns. In this study, we examined the effect of these changes on Automatic Speech Recognition (ASR) and found that the Word Error Rate (WER) on older voices is 10% absolute higher than on adult voices. Subsequently, we compared several voice source parameters, including fundamental frequency, jitter, shimmer, harmonicity, and cepstral peak prominence, of adult and older males. Several of these parameters show statistically significant differences between the two groups. However, artificially increasing the jitter and shimmer measures does not affect ASR accuracy significantly. Artificially lowering the fundamental frequency degrades ASR performance marginally, but this drop can be overcome to some extent using Vocal Tract Length Normalisation (VTLN). Overall, we observe that the changes in voice source parameters do not have a significant impact on ASR performance. A comparison of the likelihood scores of all phonemes for the two age groups shows a systematic mismatch in their acoustic spaces. A comparison of the phoneme recognition rates shows that mid vowels, nasals, and phonemes that depend on the ability to create constrictions with the tongue tip are more affected by ageing than other phonemes.
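
The abstract reports the WER gap as "10% absolute", which is easy to confuse with a relative increase. The following is a minimal Python sketch of the standard word-level edit-distance definition of WER, together with the absolute versus relative difference between two WER values. It is not the scoring tool used in the study; the function names and the example WER values (0.30 and 0.40) are illustrative assumptions, chosen only so that the absolute gap equals 10 points.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def absolute_difference(wer_a: float, wer_b: float) -> float:
    """Difference in WER points (the sense of '10% absolute')."""
    return wer_b - wer_a


def relative_difference(wer_a: float, wer_b: float) -> float:
    """Increase of wer_b expressed as a fraction of wer_a."""
    return (wer_b - wer_a) / wer_a


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))

    # Hypothetical values, NOT taken from the paper.
    adult_wer, older_wer = 0.30, 0.40
    print(round(absolute_difference(adult_wer, older_wer), 3))
    print(round(relative_difference(adult_wer, older_wer), 3))
```

With these assumed values, the same gap is 10 points in absolute terms but roughly one third in relative terms, which is why the abstract specifies "absolute".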