One of the earliest successful applications of machine-learning techniques to pattern recognition was the application of information-theoretic principles to speech recognition. Earlier approaches relied heavily on expert input, through painstaking analysis of data, to relate speech signals to the word sequences that produced them. Such methodologies were completely displaced by casting speech recognition in a probabilistic framework that models the joint probability distribution of speech signals and word sequences. Since the beginning of the 21st century, the amount of data and computation available to train and build models has increased exponentially, and the emergence of new machine-learning algorithms and methodologies has opened new vistas in approaching complex pattern recognition problems. This progress is enabled by a set of machine-learning techniques referred to as graphical models, which have computationally tractable training algorithms. Closely related are neural-network modeling techniques, and there has been a resurgence of interest in applying neural-network concepts, such as deep networks, to speech recognition. The explosion of data has also driven the development of new ways to capture the key features in massive amounts of data using efficient methods that deploy exemplar-based sparse representations. Lastly, all of these approaches can be tied together in a principled fashion using another variation of graphical models: an exponential model framework. This paper describes the current state of the art in speech recognition systems and highlights the developments that are expected to produce major breakthroughs in our ability to automatically recognize speech using computers.
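
As a brief illustration of the probabilistic framework mentioned above, the recognition decision rule is commonly written as shown here. The symbols are notation assumed for this sketch rather than taken from the text: $X$ is the observed speech signal, $W$ a candidate word sequence, and $\hat{W}$ the recognized word sequence.

```latex
\hat{W} \;=\; \arg\max_{W} P(W \mid X)
        \;=\; \arg\max_{W} P(X, W)
        \;=\; \arg\max_{W} P(X \mid W)\, P(W)
```

Under this assumed notation, $P(X \mid W)$ is usually called the acoustic model and $P(W)$ the language model. Because $P(X)$ does not depend on $W$, maximizing the joint distribution over $W$ is equivalent to maximizing the posterior.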