Automatic speech recognition and speech variability: A review
Speech Communication
Speech recognition performance degrades significantly when there is a mismatch between testing and training conditions. Linear transformation-based maximum-likelihood (ML) techniques have been proposed recently to tackle this problem. We extend this approach to use nonlinear transformations. These are implemented by multilayer perceptrons (MLPs) which transform the Gaussian means. We derive a generalized expectation-maximization (GEM) training algorithm to estimate the MLP weights. Some preliminary experimental results on nonnative speaker adaptation are presented.
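The idea of transforming Gaussian means with an MLP and training it by generalized EM can be illustrated with a small sketch. This is not the authors' formulation: the unit covariances, the single tanh hidden layer, the skip connection (so the untrained network is the identity), the synthetic data, and the backtracking gradient step are all assumptions made for illustration, and every function name is hypothetical. The E-step computes component posteriors under the transformed means; the generalized M-step only has to increase the auxiliary function rather than maximize it, which is enough to keep the likelihood from decreasing.

```python
import numpy as np

def mlp_forward(params, mu):
    """Transform Gaussian means with a one-hidden-layer MLP plus skip connection (assumed form)."""
    W1, b1, W2, b2 = params
    h = np.tanh(mu @ W1 + b1)
    return mu + h @ W2 + b2, h

def posteriors(X, means, weights):
    """E-step for a unit-covariance mixture; log-likelihood is returned up to an additive constant."""
    d2 = ((X[:, None, :] - means[None, :, :]) ** 2).sum(-1)      # (T, K) squared distances
    log_p = np.log(weights)[None, :] - 0.5 * d2
    log_norm = np.logaddexp.reduce(log_p, axis=1, keepdims=True)
    return np.exp(log_p - log_norm), float(log_norm.sum())

def aux_q(params, mu, X, gamma):
    """EM auxiliary function of the MLP weights (terms independent of the weights dropped)."""
    m, _ = mlp_forward(params, mu)
    d2 = ((X[:, None, :] - m[None, :, :]) ** 2).sum(-1)
    return -0.5 * float((gamma * d2).sum())

def gem_step(params, mu, X, weights, step=1e-2):
    """One GEM iteration: E-step, then a gradient step accepted only if the auxiliary function does not drop."""
    m, h = mlp_forward(params, mu)
    gamma, loglik = posteriors(X, m, weights)
    # Gradient of the auxiliary function with respect to the transformed means.
    G = gamma.T @ X - gamma.sum(0)[:, None] * m
    # Backpropagate through the MLP.
    W1, b1, W2, b2 = params
    ga = (G @ W2.T) * (1.0 - h ** 2)
    grads = (mu.T @ ga, ga.sum(0), h.T @ G, G.sum(0))
    q_old = aux_q(params, mu, X, gamma)
    for _ in range(20):                                           # backtracking line search
        cand = tuple(p + step * g for p, g in zip(params, grads))
        if aux_q(cand, mu, X, gamma) >= q_old:
            return cand, loglik
        step *= 0.5
    return params, loglik                                         # no improving step found

# Synthetic mismatch (assumption): test data are scaled and shifted relative to the trained means.
rng = np.random.default_rng(0)
K, D, H = 3, 2, 4
mu = 3.0 * rng.normal(size=(K, D))
weights = np.full(K, 1.0 / K)
labels = rng.integers(K, size=300)
X = 1.2 * mu[labels] + 0.5 + rng.normal(size=(300, D))

# Zero output layer: the initial transformation is the identity.
params = (0.1 * rng.normal(size=(D, H)), np.zeros(H), np.zeros((H, D)), np.zeros(D))
history = []
for _ in range(30):
    params, ll = gem_step(params, mu, X, weights)
    history.append(ll)
```

Here `history` holds the log-likelihood of the adaptation data measured before each update, so it can be inspected to check that the sequence never decreases. A linear transformation of the means would correspond to dropping the hidden layer; the MLP is what makes the mapping nonlinear.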