Penalized logistic regression with HMM log-likelihood regressors for speech recognition

Authors:
Øystein Birkenes;Tomoko Matsui;Kunio Tanabe;Sabato Marco Siniscalchi;Tor André Myrvoll;Magne Hallstein Johnsen
Affiliations:
TANDBERG, Lysaker, Norway and Norwegian University of Science and Technology, Trondheim, Norway;Institute of Statistical Mathematics, Tokyo, Japan;Science and Engineering Department, Waseda University, Tokyo, Japan;University of Palermo, Palermo, Italy and Norwegian University of Science and Technology, Trondheim, Norway;SINTEF, Trondheim, Norway and Norwegian University of Science and Technology, Trondheim, Norway;Department of Electronics and Telecommunications, Signal Processing Group, Norwegian University of Science and Technology, Trondheim, Norway
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2010

Abstract

Hidden Markov models (HMMs) are powerful generative models for sequential data that have been used in automatic speech recognition for more than two decades. Despite their popularity, HMMs make inaccurate assumptions about speech signals, thereby limiting the achievable performance of the conventional speech recognizer. Penalized logistic regression (PLR) is a well-founded discriminative classifier with long roots in the history of statistics. Its classification performance is often compared with that of the popular support vector machine (SVM). However, for speech classification, only limited success with PLR has been reported, partially due to the difficulty with sequential data. In this paper, we present an elegant way of incorporating HMMs in the PLR framework. This leads to a powerful discriminative classifier that naturally handles sequential data. In this approach, speech classification is done using affine combinations of HMM log-likelihoods. We believe that such combinations of HMMs lead to a more accurate classifier than the conventional HMM-based classifier. Unlike similar approaches, we jointly estimate the HMM parameters and the PLR parameters using a single training criterion. The extension to continuous speech recognition is done via rescoring of N-best lists or lattices.
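The classification rule described in the abstract, affine combinations of HMM log-likelihoods fed into a logistic regression model, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the made-up log-likelihood values, and the plain squared-norm penalty (the abstract does not state the paper's exact penalty term). The sketch also treats the log-likelihoods as fixed inputs, whereas the paper estimates the HMM parameters jointly with the PLR parameters.

```python
import math

def affine_scores(loglik, weights):
    # Each row of `weights` is [bias, w_1, ..., w_M]; the class score is an
    # affine combination of the M HMM log-likelihoods for one utterance.
    return [row[0] + sum(w * l for w, l in zip(row[1:], loglik)) for row in weights]

def log_posteriors(loglik, weights):
    # Log-softmax over the class scores, shifted by the max for numerical stability.
    scores = affine_scores(loglik, weights)
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]

def classify(loglik, weights):
    # Pick the class with the highest posterior probability.
    logp = log_posteriors(loglik, weights)
    return max(range(len(logp)), key=logp.__getitem__)

def penalized_nll(examples, weights, delta):
    # examples: list of (loglik, label) pairs.
    # Assumption: squared-norm penalty on all weights, scaled by delta / 2.
    nll = -sum(log_posteriors(x, weights)[y] for x, y in examples)
    penalty = 0.5 * delta * sum(w * w for row in weights for w in row)
    return nll + penalty

# Made-up log-likelihoods of one utterance under three class HMMs.
loglik = [-120.0, -118.5, -125.0]
# Zero bias and identity weights reduce to picking the most likely HMM.
identity = [[0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]
predicted = classify(loglik, identity)
```

With zero biases and identity weights, the rule picks the class whose HMM assigns the highest log-likelihood, which is the conventional HMM-based classifier. Training the weights lets each class score draw on the log-likelihoods of all HMMs, which is the sense in which the combination can be more flexible than the conventional rule.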