Speech recognition in noisy conditions is a major challenge for computer systems, yet the human brain performs it routinely and accurately. Automatic speech recognition (ASR) systems inspired by neuroscience could help bridge the performance gap between humans and machines. We present a system for noise-robust isolated word recognition that decodes sequences of spikes from a population of simulated auditory feature-detecting neurons. Each neuron is trained to respond selectively to a brief spectrotemporal pattern, or feature, drawn from the simulated auditory nerve response to speech. The population conveys the time-dependent structure of a sound through its sequence of spikes. We compare two methods for decoding these spike sequences: one uses a hidden Markov model-based recognizer, the other a novel template-based recognition scheme. In the latter, words are recognized by comparing their spike sequences to template sequences obtained from clean training data, using a similarity measure based on the length of the longest common subsequence. On isolated spoken digits from the AURORA-2 database, our combined system outperforms a state-of-the-art robust speech recognizer at low signal-to-noise ratios. Both the spike-based encoding scheme and the template-based decoding contribute gains in noise robustness over traditional speech recognition methods. Our system highlights potential advantages of spike-based acoustic coding and provides a biologically motivated framework for developing robust ASR.
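The template-based decoding described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes each utterance is encoded as an ordered list of neuron IDs (one entry per spike, sorted by spike time), and the normalization of the similarity score is an illustrative choice.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences,
    computed with the standard O(len(a) * len(b)) dynamic program."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

def lcs_similarity(a, b):
    """Similarity in [0, 1]; normalizing by the shorter sequence is an
    illustrative assumption, chosen so a perfect subsequence scores 1.0."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / min(len(a), len(b))

def classify(spike_seq, templates):
    """templates: dict mapping each word label to a list of template
    spike sequences recorded from clean training data. Returns the word
    whose best-matching template has the highest LCS similarity."""
    return max(
        templates,
        key=lambda word: max(lcs_similarity(spike_seq, t) for t in templates[word]),
    )
```

Because the LCS tolerates insertions and deletions, spurious spikes added by noise (or spikes deleted by masking) lower the score only gradually, which is one intuition for the robustness of this decoding scheme at low signal-to-noise ratios.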