Vowel Imitation Using Vocal Tract Model and Recurrent Neural Network

  • Authors:
  • Hisashi Kanda, Tetsuya Ogata, Kazunori Komatani, Hiroshi G. Okuno

  • Affiliations:
  • Graduate School of Informatics, Kyoto University, Kyoto, Japan 606-8501 (all authors)

  • Venue:
  • Neural Information Processing
  • Year:
  • 2008

Abstract

A vocal imitation system was developed using a computational model that supports the motor theory of speech perception. A critical problem in vocal imitation is how to imitate speech sounds produced by adults, whose vocal tracts have physical properties (and hence articulatory motions) differing from those of infants' vocal tracts. To solve this problem, a model based on the motor theory of speech perception was constructed. This model enables the vocal imitation system to estimate articulatory motions for unexperienced speech sounds, i.e., sounds that the system has not actually generated itself. The system was implemented using a Recurrent Neural Network with Parametric Bias (RNNPB) and a physical vocal tract model, the Maeda model. Experimental results demonstrated that the system was sufficiently robust to individual differences in speech sounds and could imitate unexperienced vowel sounds.
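
The core mechanism the abstract relies on, adapting a parametric bias (PB) vector of an RNNPB to account for a new sound, can be illustrated with a minimal sketch. The snippet below is an assumption-laden toy, not the authors' implementation: the network size, the tanh units, the single-layer recurrence, and the finite-difference PB search are all illustrative choices, and the 2-D sine trajectory merely stands in for articulatory parameters. It demonstrates the recognition phase of an RNNPB: the connection weights stay fixed while only the PB vector is adapted to minimize one-step prediction error on a given sequence.

```python
# Minimal RNNPB sketch (illustrative only, not the paper's implementation).
import numpy as np

rng = np.random.default_rng(0)

class RNNPB:
    def __init__(self, n_in, n_hidden, n_pb):
        # Hidden state is driven by [x_t, pb, h_{t-1}]; output predicts x_{t+1}.
        self.W_in = rng.normal(0.0, 0.3, (n_hidden, n_in + n_pb + n_hidden))
        self.b_h = np.zeros(n_hidden)
        self.W_out = rng.normal(0.0, 0.3, (n_in, n_hidden))
        self.b_out = np.zeros(n_in)
        self.n_hidden = n_hidden

    def predict(self, seq, pb):
        """One-step-ahead predictions for every step of `seq` but the last."""
        h = np.zeros(self.n_hidden)
        preds = []
        for x in seq[:-1]:
            z = np.concatenate([x, pb, h])
            h = np.tanh(self.W_in @ z + self.b_h)
            preds.append(np.tanh(self.W_out @ h + self.b_out))
        return np.array(preds)

    def pb_error(self, seq, pb):
        """Mean squared one-step prediction error under a given PB vector."""
        return float(np.mean((self.predict(seq, pb) - seq[1:]) ** 2))

def estimate_pb(net, seq, n_pb, steps=200, lr=0.5, eps=1e-4):
    """Recognition phase: weights are frozen; only the PB vector is updated
    by gradient descent (central differences here, for brevity)."""
    pb = np.zeros(n_pb)
    for _ in range(steps):
        grad = np.zeros(n_pb)
        for i in range(n_pb):
            d = np.zeros(n_pb)
            d[i] = eps
            grad[i] = (net.pb_error(seq, pb + d) - net.pb_error(seq, pb - d)) / (2 * eps)
        pb -= lr * grad
    return pb

# Toy usage: a 2-D trajectory stands in for vocal tract parameters.
t = np.linspace(0.0, 2.0 * np.pi, 30)
seq = np.stack([0.5 * np.sin(t), 0.5 * np.cos(t)], axis=1)
net = RNNPB(n_in=2, n_hidden=16, n_pb=2)
pb = estimate_pb(net, seq, n_pb=2)
print("estimated PB:", pb, "error:", net.pb_error(seq, pb))
```

In the paper's setting, the sequences would be sound and articulatory parameters of the Maeda vocal tract model, and the weights would first be trained on the system's own vocalizations so that similar sounds map to nearby PB values. The weights here are left random, so the demo exercises only the PB-search mechanism itself.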