Spoken Language Processing: A Guide to Theory, Algorithm, and System Development
Spoken Language Processing: A Guide to Theory, Algorithm, and System Development
A low-power accelerator for the SPHINX 3 speech recognition system
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
Architectural optimizations for low-power, real-time speech recognition
Proceedings of the 2003 international conference on Compilers, architecture and synthesis for embedded systems
Hardware speech recognition for user interfaces in low cost, low power devices
Proceedings of the 42nd annual Design Automation Conference
Proceedings of the 2007 ACM/SIGDA 15th international symposium on Field programmable gate arrays
Speech silicon: an FPGA architecture for real-time hidden Markov-model-based speech recognition
EURASIP Journal on Embedded Systems
A multi-fpga 10x-real-time high-speed search engine for a 5000-word vocabulary speech recognizer
Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays
VLSI for 5000-word continuous speech recognition
ICASSP '09 Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
Simulation-based word-length optimization method for fixed-pointdigital signal processing systems
IEEE Transactions on Signal Processing
Flexible and Expandable Speech Recognition Hardware with Weighted Finite State Transducers
Journal of Signal Processing Systems
Fast Likelihood Computation in Speech Recognition using Matrices
Journal of Signal Processing Systems
Hi-index | 0.00 |
A real-time hardware-based large vocabulary speech recognizer requires high memory bandwidth. We have developed a field-programmable-gate-array (FPGA)-based 20000-word speech recognizer utilizing efficient dynamic random access memory (DRAM) access. This system contains all the functional blocks for hidden-Markov-model-based speaker-independent continuous speech recognition: feature extraction, emission probability computation, and intraword and interword Viterbi beam search. The feature extraction is conducted in software on a soft-core-based CPU, while the other functional units are implemented using parallel and pipelined hardware blocks. In order to reduce the number of memory access operations, we used several techniques such as bitwidth reduction of the Gaussian parameters, multiframe computation of the emission probability, and two-stage language model pruning. We also employ a customized DRAM controller that supports various access patterns optimized for each functional unit of the speech recognizer. The speech recognition hardware was synthesized for the Virtex-4 FPGA, and it operates at 100 MHz. The experimental result on Nov 92 20 k test set shows that the developed system runs 1.52 and 1.39 times faster than real time using the bigram and trigram language models, respectively.