Continuous speech recognition with sparse coding

Authors:
W. J. Smit; E. Barnard
Affiliations:
University of Pretoria, Pretoria 0001, South Africa; Meraka Institute, P.O. Box 395, Pretoria 0001, South Africa
Venue:
Computer Speech and Language
Year:
2009

Abstract

Sparse coding is an efficient way of coding information. In a sparse code, most of the code elements are zero and only a few are active. Sparse codes are intended to correspond to the spike trains with which biological neurons communicate. In this article, we show how sparse codes can be used for continuous speech recognition, using the TIDIGITS dataset to illustrate the process. First, a waveform is transformed into a spectrogram, and a sparse code for the spectrogram is found by means of a linear generative model. The resulting spike train is then classified using a spike train model and dynamic programming. Finding a sparse code is computationally expensive; we use an iterative subset selection algorithm with quadratic programming for this step. This algorithm finds a sparse code in reasonable time if the input is limited to a fairly coarse spectral resolution. At this resolution, our system achieves a word error rate of 19%, whereas a system based on Hidden Markov Models achieves a word error rate of 15% at the same resolution.
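
To illustrate what finding a sparse code under a linear generative model can look like, the sketch below approximates an input vector x (standing in for one spectrogram frame) as D a, where D is a dictionary and a is a nonnegative code with few active elements. This is not the article's iterative subset selection algorithm: it swaps in a plain projected-gradient (ISTA-style) solver for an L1-penalised nonnegative least-squares problem. The function name sparse_code, the random dictionary, and the values of lam and n_iter are all assumptions made for the example.

```python
import numpy as np

def sparse_code(x, D, lam=0.01, n_iter=500):
    """Approximately minimise 0.5*||x - D a||^2 + lam*sum(a) subject to a >= 0.

    Hypothetical helper for illustration; uses projected gradient steps,
    not the subset selection method described in the abstract.
    """
    # Step size 1/L, where L is the largest eigenvalue of D^T D.
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)                    # gradient of the quadratic term
        a = np.maximum(a - (grad + lam) / L, 0.0)   # penalty step, then clip to a >= 0
    return a

def objective(x, D, a, lam):
    """Value of the penalised reconstruction cost for a given code."""
    return 0.5 * np.sum((x - D @ a) ** 2) + lam * np.sum(a)

# Toy data: a random unit-norm dictionary and an input built from two atoms.
rng = np.random.default_rng(0)
D = rng.standard_normal((16, 64))
D /= np.linalg.norm(D, axis=0)
a_true = np.zeros(64)
a_true[[3, 40]] = [1.0, 0.5]
x = D @ a_true

lam = 0.01
a = sparse_code(x, D, lam=lam)
print("active elements:", np.count_nonzero(a > 1e-3))
print("cost:", objective(x, D, a, lam))
```

The nonnegativity constraint is what lets the active elements be read as spikes with an amplitude. In a setting like the one described above, D would be learned from spectrogram data rather than drawn at random.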