Real-Time Recognition of Spoken Words

  • Authors: L. C. W. Pols
  • Venue: IEEE Transactions on Computers
  • Year: 1971

Abstract

First, a survey is given of a number of published vowel and word recognition systems. Then a new real-time word recognition system is described that uses only a small computer (8K memory) and a few analog peripherals. The essentials of the procedure are as follows. While a word is being pronounced, a spectral analysis is carried out by a bank of 17 one-third-octave bandpass filters. The filter outputs are logarithmically amplified, and the maximal amplitude of each envelope is determined and sampled every 15 ms. In this way a word is characterized by a sequence of sample points in a 17-dimensional space. A principal components analysis then reduces the original 17 dimensions to 3. After a linear time normalization, the 3-dimensional trace of the spoken word is compared with 20 reference traces, representing the 20 possible utterances (the 10 digits plus 10 computer commands). The machine responds by naming the best-fitting trace. With the 20 speakers of the design set, the machine is correct 98.8 percent of the time.
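
The last two steps of the procedure, linear time normalization followed by comparison against reference traces, can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not state the resampling length or the distance measure, so the 16-point default and the summed squared Euclidean distance are assumptions, and the names time_normalize, trace_distance, and recognize are hypothetical. The sketch starts from traces already reduced to 3 dimensions and leaves out the filter bank and the principal components analysis.

```python
from typing import Dict, List, Sequence

Point = Sequence[float]
Trace = Sequence[Point]


def time_normalize(trace: Trace, n_points: int = 16) -> List[List[float]]:
    """Linearly resample a trace to n_points equally spaced positions.

    n_points = 16 is an assumed value; the abstract does not give one.
    """
    if not trace:
        raise ValueError("trace must contain at least one sample")
    if n_points < 1:
        raise ValueError("n_points must be positive")
    if len(trace) == 1 or n_points == 1:
        return [list(trace[0]) for _ in range(n_points)]
    last = len(trace) - 1
    resampled = []
    for k in range(n_points):
        pos = k * last / (n_points - 1)  # fractional index into the original trace
        i = min(int(pos), last - 1)      # left neighbour, clamped at the end
        frac = pos - i
        a, b = trace[i], trace[i + 1]
        # Linear interpolation between the two neighbouring samples.
        resampled.append([x + frac * (y - x) for x, y in zip(a, b)])
    return resampled


def trace_distance(a: Trace, b: Trace) -> float:
    """Summed squared Euclidean distance between two equal-length traces.

    Assumed distance measure; the abstract only says "best fitting".
    """
    return sum((x - y) ** 2 for p, q in zip(a, b) for x, y in zip(p, q))


def recognize(trace: Trace, references: Dict[str, Trace], n_points: int = 16) -> str:
    """Return the label of the reference trace closest to the input trace."""
    probe = time_normalize(trace, n_points)
    return min(
        references,
        key=lambda label: trace_distance(
            probe, time_normalize(references[label], n_points)
        ),
    )
```

Because both the input and the references are resampled to the same number of points, a word spoken slowly (more 15 ms samples) and the same word spoken quickly (fewer samples) are compared position by position along the trace rather than sample by sample.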