Fundamentals of speech recognition
The principal cause of speech recognition errors is a mismatch between the trained acoustic/language models and the input speech, arising because the amount of training data is limited in comparison with the vast variability of speech. It is crucial to establish methods that are robust against voice variation due to speaker individuality, the physical and psychological condition of the speaker, telephone sets, microphones, network characteristics, additive background noise, speaking styles, and other factors. This paper gives an overview of robust architectures and modeling techniques for speech recognition and understanding. The topics include acoustic and language modeling for spontaneous speech recognition, unsupervised adaptation of acoustic and language models, robust architectures for spoken dialogue systems, multi-modal speech recognition, and speech summarization. The paper also discusses the most important research problems to be solved in order to achieve ultimately robust speech recognition and understanding systems.