SVMs for automatic speech recognition: a survey

Authors:
R. Solera-Ureña;J. Padrell-Sendra;D. Martín-Iglesias;A. Gallardo-Antolín;C. Peláez-Moreno;F. Díaz-De-María
Affiliations:
Signal Theory and Communications Department, EPS-Universidad Carlos III de Madrid, Madrid, Spain;Signal Theory and Communications Department, EPS-Universidad Carlos III de Madrid, Madrid, Spain;Signal Theory and Communications Department, EPS-Universidad Carlos III de Madrid, Madrid, Spain;Signal Theory and Communications Department, EPS-Universidad Carlos III de Madrid, Madrid, Spain;Signal Theory and Communications Department, EPS-Universidad Carlos III de Madrid, Madrid, Spain;Signal Theory and Communications Department, EPS-Universidad Carlos III de Madrid, Madrid, Spain
Venue:
Progress in nonlinear speech processing
Year:
2007

Abstract

Hidden Markov Models (HMMs) are undoubtedly the most widely used core technique for Automatic Speech Recognition (ASR). Nevertheless, we are still far from achieving high-performance ASR systems. Some alternative approaches, most of them based on Artificial Neural Networks (ANNs), were proposed during the late eighties and early nineties. Some of them tackled the ASR problem using predictive ANNs, while others proposed hybrid HMM/ANN systems. Despite some achievements, however, HMMs remain the dominant approach today. During the last decade, a new tool appeared in the field of machine learning that has proved able to cope with hard classification problems in several fields of application: Support Vector Machines (SVMs). SVMs are effective discriminative classifiers with several outstanding characteristics: their solution is the one with maximum margin; they are capable of dealing with samples of very high dimensionality; and their convergence to the minimum of the associated cost function is guaranteed. These characteristics have made SVMs very popular and successful. In this chapter we discuss their strengths and weaknesses in the ASR context and review the current state-of-the-art techniques. We organize the contributions in two parts: isolated-word recognition and continuous speech recognition. In the first part we review several techniques for producing the fixed-dimension vectors required by the original SVM formulation. We then explore more sophisticated techniques based on kernels capable of dealing with sequences of different lengths. Among them is DTAK, a simple and effective kernel that revives an old speech recognition technique: Dynamic Time Warping (DTW). In the second part, we describe some recent approaches that use SVMs to tackle more complex tasks such as connected digit recognition and continuous speech recognition. Finally, we draw some conclusions and outline several ongoing lines of research.
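
The two ideas mentioned for isolated-word recognition can be illustrated with a minimal sketch. It is not taken from the chapter: the function names `fixed_length` and `dtw_distance` are hypothetical, uniform segment averaging is only one assumed way of obtaining fixed-dimension vectors, and the second function is plain DTW with a Euclidean local distance, not the DTAK kernel itself.

```python
import math


def fixed_length(frames, k):
    """Map a variable-length sequence of feature vectors to a k*dim vector.

    Assumed scheme: split the sequence into k contiguous segments and
    average the frames inside each segment.
    """
    n = len(frames)
    dim = len(frames[0])
    out = []
    for s in range(k):
        start = s * n // k
        # Guarantee at least one frame per segment, even when n < k.
        end = max((s + 1) * n // k, start + 1)
        segment = frames[start:end]
        for d in range(dim):
            out.append(sum(f[d] for f in segment) / len(segment))
    return out


def dtw_distance(a, b):
    """Accumulated cost of the best DTW alignment between sequences a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j]: minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean local distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


if __name__ == "__main__":
    short = [(0.0,), (1.0,), (2.0,)]
    stretched = [(0.0,), (0.0,), (1.0,), (2.0,), (2.0,)]
    print(dtw_distance(short, stretched))
    print(fixed_length([(1.0, 2.0), (3.0, 4.0), (5.0, 6.0), (7.0, 8.0)], 2))
```

The first function yields vectors of equal size that a standard SVM can consume directly, at the cost of discarding fine timing detail. The second aligns two utterances of different lengths, which is the kind of alignment a sequence kernel such as DTAK builds on.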