A Speech Recognition IC Using Hidden Markov Models with Continuous Observation Densities

Authors:
Wei Han;Kwok-Wai Hon;Cheong-Fat Chan;Chiu-Sing Choy;Kong-Pang Pun
Affiliations:
Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong, People's Republic of China (all authors)
Venue:
Journal of VLSI Signal Processing Systems
Year:
2007

Citing 2
Cited 0

Fundamentals of speech recognition
Low Power VLSI Architecture of Viterbi Scorer for HMM-based Isolated Word Recognition. ISQED '02: Proceedings of the 3rd International Symposium on Quality Electronic Design

Abstract

This paper presents the design of a speech recognition IC using hidden Markov models (HMMs) with continuous observation densities, together with the results of offline and live recognition tests. The design employs a table look-up method to simplify the computation and hence the circuit architecture. In the current design, each HMM state is represented by a double-mixture Gaussian distribution. With minor modifications, the proposed architecture can be extended to implement a recognizer that uses higher-order multi-mixture Gaussian distributions for more precise acoustic modeling. The test chip is fabricated in a 0.35 μm CMOS technology, and its maximum operating frequency is 62.5 MHz at 3.3 V. For a 50-word vocabulary, the estimated recognition time is about 0.16 s. On noise-corrupted utterances, the recognition accuracy is 93.8% for isolated English digits, which is comparable to that of a software implementation of the same algorithm. A live recognition test was also run on a vocabulary of 11 Chinese words, giving an accuracy of 91.8% across five male and five female speakers.
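The abstract does not say what the look-up table holds, so the following is only a minimal sketch of one common way a table look-up can simplify the evaluation of a double-mixture Gaussian state in the log domain. It assumes the table stores the correction term log(1 + exp(-d)) used to add two log-probabilities. The table size, the step, and all function names are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical table parameters (assumptions, not values from the paper).
TABLE_STEP = 1.0 / 16   # resolution of the difference d between two log-values
TABLE_SIZE = 256        # covers d in [0, 16); beyond that the correction is negligible

# Precomputed correction term log(1 + exp(-d)) on a uniform grid of d.
LOG_ADD_TABLE = [math.log1p(math.exp(-i * TABLE_STEP)) for i in range(TABLE_SIZE)]


def log_add(a, b):
    """Approximate log(exp(a) + exp(b)) with a subtraction, a look-up, and an addition."""
    hi, lo = (a, b) if a >= b else (b, a)
    idx = int((hi - lo) / TABLE_STEP)
    if idx >= TABLE_SIZE:
        return hi  # the smaller term no longer contributes measurably
    return hi + LOG_ADD_TABLE[idx]


def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_two_mixture(x, weights, means, variances):
    """Log-likelihood of x under a two-component Gaussian mixture."""
    a = math.log(weights[0]) + log_gaussian_diag(x, means[0], variances[0])
    b = math.log(weights[1]) + log_gaussian_diag(x, means[1], variances[1])
    return log_add(a, b)
```

With this scheme, combining the two mixture components needs no exponential or logarithm at run time, which is the kind of simplification a hardware scorer benefits from. A finer table step trades memory for accuracy.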