Predictor-corrector adaptation by using time evolution system with macroscopic time scale

Authors:
Shinji Watanabe; Atsushi Nakamura
Affiliations:
NTT Communication Science Laboratories, NTT Corporation, Kyoto, Japan; NTT Communication Science Laboratories, NTT Corporation, Kyoto, Japan
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2010

Abstract

Incremental adaptation techniques for speech recognition aim to adjust acoustic models to time-variant acoustic characteristics arising from factors such as changes in speaker, speaking style, and noise source over time. In this paper, we propose a novel incremental adaptation framework that models such time-variant characteristics by successively updating the posterior distributions of acoustic model parameters on a macroscopic time scale (e.g., every set of more than a dozen utterances). The proposed incremental update uses a predictor-corrector algorithm based on a macroscopic time evolution system, in accordance with Kalman filter theory. We also provide a unified interpretation of the proposal and the two major conventional approaches: indirect adaptation via transformation parameters [e.g., maximum-likelihood linear regression (MLLR)] and direct adaptation of classifier parameters [e.g., maximum a posteriori (MAP) estimation]. We show analytically and experimentally that the proposed incremental adaptation realizes the predictor-corrector algorithm and encompasses both conventional approaches as well as their combination. Consequently, the proposal achieves robust recognition performance through incremental adaptation that balances quickness and stability.
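
To make the predictor-corrector idea concrete, the following is a minimal sketch of a generic scalar Kalman-style update applied once per block of observations. It is not the paper's algorithm: the paper updates posterior distributions of acoustic model parameters, whereas this toy tracks a single drifting mean. All names (`Gaussian`, `predict`, `correct`, `adapt`, `drift_var`, `obs_var`) are hypothetical, and the random-walk evolution model and known observation variance are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Gaussian:
    """Gaussian belief over one scalar parameter (illustrative only)."""
    mean: float
    var: float


def predict(post: Gaussian, drift_var: float) -> Gaussian:
    """Predictor step: assume a random-walk time evolution between blocks,
    so the mean is carried over and the variance grows by drift_var."""
    return Gaussian(post.mean, post.var + drift_var)


def correct(prior: Gaussian, block: list, obs_var: float) -> Gaussian:
    """Corrector step: fold the sample mean of one block into the prior.
    The block mean of n observations has variance obs_var / n."""
    n = len(block)
    block_mean = sum(block) / n
    block_var = obs_var / n
    gain = prior.var / (prior.var + block_var)  # Kalman gain, between 0 and 1
    mean = prior.mean + gain * (block_mean - prior.mean)
    var = (1.0 - gain) * prior.var
    return Gaussian(mean, var)


def adapt(initial: Gaussian, blocks: list, drift_var: float, obs_var: float) -> list:
    """Run one predict/correct cycle per block (the 'macroscopic' time step)."""
    post = initial
    history = []
    for block in blocks:
        post = correct(predict(post, drift_var), block, obs_var)
        history.append(post)
    return history


if __name__ == "__main__":
    # Three blocks of toy observations; the underlying mean shifts after block 1.
    blocks = [[0.0, 0.2, -0.1], [1.0, 1.1, 0.9], [1.0, 1.2, 0.8]]
    for t, g in enumerate(adapt(Gaussian(0.0, 1.0), blocks, drift_var=0.5, obs_var=1.0)):
        print(t, round(g.mean, 3), round(g.var, 3))
```

In this sketch, a larger `drift_var` raises the gain so the estimate follows new blocks quickly, while a smaller value keeps it close to the accumulated posterior. This is one simple way to picture the trade-off between quickness and stability mentioned in the abstract.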