A study of speech recognition for children and the elderly

  • Authors:
  • J. G. Wilpon; C. N. Jacobsen

  • Affiliations:
  • AT&T Bell Labs., Murray Hill, NJ, USA; AT&T Bell Labs., Murray Hill, NJ, USA

  • Venue:
  • ICASSP '96: Proceedings of the 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 01
  • Year:
  • 1996

Abstract

Although children and the elderly have obvious needs for voice-operated interfaces, hardly anything is known about how current automatic speech recognition technology performs for these speakers. In this paper we report the results of a thorough investigation into this field using a connected digit recognizer and a major telephone speech database. One would generally assume that recognizing speech from these speakers is only a matter of having enough, sufficiently representative training data. This turns out to be true only as long as the speakers fall in the age range of 15 to approximately 70. Outside this range the error rates increase dramatically, even with balanced amounts of training data. For males, the lower limit is very sharp and can be attributed to the change of pitch frequency during puberty. For females, the lower limit is gradual and is caused only by the slowly changing length of the vocal tract. For both genders, the upper limit is very gradual and can possibly be attributed to changes in the glottis area and the internal control loops of the human articulatory system. The paper presents some supporting evidence for the above assertions and gives results for various attempts to improve the performance. Recognition of speech from children and the elderly will require much more research if we are to fully understand how the characteristics of these age groups affect current and future speech recognition systems.