Towards age-independent acoustic modeling

Authors:
Matteo Gerosa; Diego Giuliani; Fabio Brugnara
Affiliations:
FBK, Fondazione Bruno Kessler, I-38100 Povo, Trento, Italy (all authors)
Venue:
Speech Communication
Year:
2009

Abstract

In automatic speech recognition applications, adults and children are usually treated as two separate population groups, with different acoustic models trained for each, because their voice characteristics differ significantly. This paper investigates age-independent acoustic modeling in the context of large-vocabulary speech recognition. Using a small amount (9 h) of children's speech and a larger amount (57 h) of adult speech, age-independent acoustic models are trained with several methods for speaker-adaptive acoustic modeling. Recognition results obtained with these models are compared with those obtained with age-dependent acoustic models for children and for adults. Recognition experiments are performed on four Italian speech corpora, two of children's speech and two of adult speech, using 64k-word and 11k-word trigram language models. The speaker-adaptive methods prove effective for training age-independent acoustic models, yielding recognition results at least as good as those achieved with age-dependent acoustic models for adults and children.