Trajectory mixture density networks with multiple mixtures for acoustic-articulatory inversion

Authors:
Korin Richmond
Affiliations:
Centre for Speech Technology Research, Edinburgh University, Edinburgh, United Kingdom
Venue:
NOLISP'07 Proceedings of the 2007 international conference on Advances in nonlinear speech processing
Year:
2007

Abstract

We have previously proposed a trajectory model based on a mixture density network (MDN) trained with target variables augmented with dynamic features, together with an algorithm for estimating maximum likelihood trajectories that respects the constraints between the static and dynamic features. In this paper, we extend that model to allow diagonal covariance matrices and multiple mixture components in the output probability density functions of the trajectory MDN (TMDN). We evaluated the extended model on an acoustic-articulatory inversion mapping task and found that the trajectory model works well, outperforming equivalent trajectories smoothed by low-pass filtering. Increasing the number of mixture components in the TMDN improves results further.
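
The maximum likelihood trajectory estimation mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a single Gaussian per frame with diagonal covariance, one articulatory channel, and a simple central-difference delta window, all of which are assumptions made here for illustration. The function names `delta_window_matrix` and `ml_trajectory` are hypothetical. Under these assumptions the most likely static trajectory y solves the linear system (W^T P W) y = W^T P mu, where W maps static values to stacked static and delta features and P is the diagonal precision matrix.

```python
import numpy as np

def delta_window_matrix(T):
    """Build W (2T x T) mapping a static trajectory to [static, delta] per frame.

    Assumption: delta[t] = 0.5 * (y[t+1] - y[t-1]), with indices clamped at the edges.
    """
    W = np.zeros((2 * T, T))
    for t in range(T):
        W[2 * t, t] = 1.0                      # static feature row
        W[2 * t + 1, min(t + 1, T - 1)] += 0.5  # delta feature row
        W[2 * t + 1, max(t - 1, 0)] -= 0.5
    return W

def ml_trajectory(means, variances):
    """Most likely static trajectory under per-frame diagonal Gaussians.

    means, variances: arrays of shape (T, 2) with columns [static, delta].
    Solves (W^T P W) y = W^T P mu, where P = diag(1 / variances).
    """
    T = means.shape[0]
    W = delta_window_matrix(T)
    mu = means.reshape(-1)                  # interleaved [s0, d0, s1, d1, ...]
    precision = 1.0 / variances.reshape(-1)
    A = W.T @ (precision[:, None] * W)
    b = W.T @ (precision * mu)
    return np.linalg.solve(A, b)

# Usage: means that are exactly consistent with a known trajectory recover it.
T = 20
y_true = np.sin(np.linspace(0.0, 3.0, T))
means = (delta_window_matrix(T) @ y_true).reshape(T, 2)
variances = np.ones((T, 2))
y_hat = ml_trajectory(means, variances)
```

With multiple mixture components, as in the extended model, the likelihood is no longer a single Gaussian and this closed-form solve does not apply directly. An iterative scheme would be needed instead, and the sketch does not attempt to reproduce it.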