With ageing, human voices undergo several changes, typically characterized by increased hoarseness and altered articulation patterns. In this study, we examined the effect of these changes on Automatic Speech Recognition (ASR) and found that the Word Error Rate (WER) on older voices is 10% absolute higher than on adult voices. Subsequently, we compared several voice source parameters, including fundamental frequency, jitter, shimmer, harmonicity, and cepstral peak prominence, of adult and older males. Several of these parameters show statistically significant differences between the two groups. However, artificially increasing the jitter and shimmer measures does not affect ASR accuracy significantly. Artificially lowering the fundamental frequency degrades ASR performance marginally, but this drop can be overcome to some extent using Vocal Tract Length Normalisation (VTLN). Overall, we observe that changes in the voice source parameters do not have a significant impact on ASR performance. Comparison of the likelihood scores of all phonemes for the two age groups shows a systematic mismatch between the acoustic spaces of the two groups. Comparison of the phoneme recognition rates shows that mid vowels, nasals, and phonemes that depend on the ability to create constrictions with the tongue tip are more affected by ageing than other phonemes.
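To make the "10% absolute" phrasing concrete, the following is a minimal sketch of how WER is conventionally computed (word-level edit distance divided by the number of reference words) and how an absolute gap differs from a relative one. The function names and the example figures are hypothetical illustrations, not the scoring tool or the results used in the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (standard Levenshtein recurrence).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def absolute_gap(wer_a: float, wer_b: float) -> float:
    """Difference in WER, in the same units as the inputs."""
    return wer_a - wer_b


def relative_gap(wer_a: float, wer_b: float) -> float:
    """Difference in WER as a fraction of the baseline wer_b."""
    return (wer_a - wer_b) / wer_b


if __name__ == "__main__":
    # Made-up transcripts, only to exercise the function.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))

    # Made-up WER values (NOT from the study): the same pair of numbers
    # gives a 10-point absolute gap but a much larger relative gap.
    older, adult = 0.30, 0.20
    print(absolute_gap(older, adult), relative_gap(older, adult))
```

Reporting the gap as absolute, as the abstract does, means the two WER percentages differ by 10 points regardless of the baseline value; the relative increase depends on the adult WER, which the abstract does not state.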