Statistical modelling in continuous speech recognition (CSR)

Authors:
Steve Young
Affiliations:
Cambridge University Engineering Dept., Cambridge, England
Venue:
UAI'01 Proceedings of the Seventeenth conference on Uncertainty in artificial intelligence
Year:
2001

Abstract

Automatic continuous speech recognition (CSR) is sufficiently mature that a variety of real-world applications are now possible, including large-vocabulary transcription and interactive spoken dialogues. This paper reviews the evolution of the statistical modelling techniques that underlie current systems, specifically hidden Markov models (HMMs) and N-grams. Starting from a description of the speech signal and its parameterisation, it discusses the various modelling assumptions and their consequences, and then describes techniques by which the effects of these assumptions can be mitigated. Despite the progress that has been made, the limitations of current modelling techniques are still evident. The paper therefore concludes with a brief review of some of the more fundamental modelling work now in progress.