Over the past few years, there has been a resurgence of interest in designing high-accuracy automatic speech recognition (ASR) systems due to the key role they can play in many real-world applications, such as voiceprints for biometric identification, language identification, and call scanning. Improving current state-of-the-art technology is therefore vital for the success of these applications, yet this is not simple with the standard technology based on hidden Markov models (HMMs) trained on short-term spectral features. This paper offers an innovative perspective on how two novel, prominent approaches to ASR, namely speech attribute detection and discriminative training, can be combined into a unified framework with beneficial effects on overall speech recognition performance. This goal is achieved by embedding phonetic feature detection into a penalized logistic regression machine (PLRM). The proposed approach is evaluated on both isolated and continuous phoneme recognition tasks. Experimental evidence indicates that the proposed framework achieves state-of-the-art performance in the isolated recognition task and outperforms current technology in the continuous recognition task.