Hardware speech recognition for user interfaces in low cost, low power devices

Authors:
Sergiu Nedevschi;Rabin K. Patra;Eric A. Brewer
Affiliations:
University of California at Berkeley;University of California at Berkeley;University of California at Berkeley
Venue:
Proceedings of the 42nd annual Design Automation Conference
Year:
2005

Abstract

We propose a system architecture for real-time hardware speech recognition on low-cost, power-constrained devices. The system is intended to support real-time speech-based user interfaces as part of an effort to bring Information and Communication Technologies (ICTs) to underdeveloped regions of the world.

Our system architecture exploits a shared infrastructure model. The computationally intensive task of speech model training and retraining is performed offline by shared servers, while the actual recognition of speech is conducted on low-cost hand-held devices using custom hardware.

The recognizer is extremely flexible and can support multiple languages or dialects with speaker-independent recognition. Dynamic loading of speech models is used for changing language grammar and retraining, while reprogramming is used to support evolution of recognition algorithms. The focus on small sets of words (at one time) reduces the complexity, cost and power consumption. We design the speech decoder, the central component of the recognizer, and we validate it via a prototype FPGA implementation. We then use ASIC synthesis to estimate power and size for the design.

Our evaluations demonstrate an order of magnitude improvement in power compared with optimized recognition software running on a low-power embedded general-purpose processor of the same technology and of similar capabilities. The synthesis also estimates the area of the design to be about 2.5 mm², showing potential for lower cost. In designing and testing our recognizer we use datasets in both English and Tamil languages.