Speech silicon: an FPGA architecture for real-time hidden Markov-model-based speech recognition

Authors:
Jeffrey Schuster;Kshitij Gupta;Raymond Hoare;Alex K. Jones
Affiliations:
University of Pittsburgh, Pittsburgh, PA;University of Pittsburgh, Pittsburgh, PA;University of Pittsburgh, Pittsburgh, PA;University of Pittsburgh, Pittsburgh, PA
Venue:
EURASIP Journal on Embedded Systems
Year:
2006

Abstract

This paper examines the design of an FPGA-based system-on-a-chip capable of performing continuous speech recognition on medium-sized vocabularies in real time. By creating three dedicated pipelines, one for each of the major operations in the system, we were able to maximize throughput while minimizing the number of pipeline stalls. Further, implementing a token-passing scheme between the later stages of the system greatly reduced the complexity of the control logic and minimized the amount of active data present in the system at any time. Additionally, through in-depth analysis of the SPHINX 3 large-vocabulary continuous speech recognition engine, we were able to design models that could be efficiently benchmarked against a known software platform. These results, combined with the ability to reprogram the system for different recognition tasks, yield a system capable of performing real-time speech recognition in a wide range of environments.
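
The token-passing idea mentioned in the abstract can be illustrated with a small software sketch. This is a generic token-passing Viterbi search over a toy left-to-right HMM, not the paper's hardware design: the function names, the `"start"` entry state, the beam width, and all probabilities below are assumptions made up for illustration. The point it shows is that only states currently holding a token are processed in each frame, which keeps the amount of active data small.

```python
import math

def token_passing_step(tokens, log_trans, log_obs, beam=10.0):
    """Advance every live token by one frame.

    tokens:    dict mapping state -> (log_score, path_tuple)
    log_trans: dict mapping state -> {next_state: log transition prob}
    log_obs:   dict mapping state -> log observation likelihood for this frame
    beam:      tokens scoring more than `beam` below the best are dropped
    """
    new_tokens = {}
    # Only states that currently hold a token are visited.
    for state, (score, path) in tokens.items():
        for nxt, log_a in log_trans.get(state, {}).items():
            cand = score + log_a + log_obs[nxt]
            # Keep the best-scoring token arriving at each state (Viterbi max).
            if nxt not in new_tokens or cand > new_tokens[nxt][0]:
                new_tokens[nxt] = (cand, path + (nxt,))
    if not new_tokens:
        return new_tokens
    best = max(score for score, _ in new_tokens.values())
    # Beam pruning: discard tokens that fall too far behind the best one.
    return {s: t for s, t in new_tokens.items() if t[0] >= best - beam}

def decode(frames, log_trans, beam=10.0):
    """Return (log_score, path) of the best token after the last frame."""
    # A single token starts in a non-emitting entry state (hypothetical name).
    tokens = {"start": (0.0, ())}
    for log_obs in frames:
        tokens = token_passing_step(tokens, log_trans, log_obs, beam)
    return max(tokens.values(), key=lambda t: t[0])

def _log_row(p0, p1, p2):
    """Build one frame of log observation likelihoods for states 0, 1, 2."""
    return {0: math.log(p0), 1: math.log(p1), 2: math.log(p2)}

# Toy three-state left-to-right model (all numbers are illustrative).
LOG_TRANS = {
    "start": {0: 0.0},
    0: {0: math.log(0.5), 1: math.log(0.5)},
    1: {1: math.log(0.5), 2: math.log(0.5)},
    2: {2: 0.0},
}
FRAMES = [
    _log_row(0.9, 0.05, 0.05),
    _log_row(0.9, 0.05, 0.05),
    _log_row(0.05, 0.9, 0.05),
    _log_row(0.05, 0.05, 0.9),
]

if __name__ == "__main__":
    score, path = decode(FRAMES, LOG_TRANS)
    print(path, math.exp(score))
```

Calling `decode(FRAMES, LOG_TRANS)` returns the log score and state path of the best surviving token. Narrowing `beam` drops more tokens per frame, trading search accuracy for less active data.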