Acoustic variability and automatic recognition of children's speech

Authors:
Matteo Gerosa;Diego Giuliani;Fabio Brugnara
Affiliations:
ITC-IRST, Centro per la Ricerca Scientifica e Tecnologica, I-38050 Povo, Trento, Italy;ITC-IRST, Centro per la Ricerca Scientifica e Tecnologica, I-38050 Povo, Trento, Italy;ITC-IRST, Centro per la Ricerca Scientifica e Tecnologica, I-38050 Povo, Trento, Italy
Venue:
Speech Communication
Year:
2007

Abstract

This paper presents several acoustic analyses of read speech collected from Italian children aged 7 to 13 years and North American children aged 5 to 17 years. The analyses aimed to achieve a better understanding of the spectral and temporal changes in speech produced by children of various ages, with a view to developing automatic speech recognition applications. The results confirm and complement those reported in the literature, showing that the characteristics of children's speech change with age and that spectral and temporal variability decrease as age increases. In particular, younger children show substantially higher intra- and inter-speaker variability than older children and adults. We investigated several methods for speaker-adaptive acoustic modeling to cope with inter-speaker spectral variability and to improve recognition performance for children. These methods proved effective in the recognition of read speech with a vocabulary of about 11k words.