Elliptical basis functions for segment modeling

  • Authors:
  • G. Zavaliagkos; R. Schwartz; J. Makhoul

  • Affiliations:
  • Northeastern University, Boston, MA; BBN Systems and Technologies, Cambridge, MA; BBN Systems and Technologies, Cambridge, MA

  • Venue:
  • ICASSP '93: Proceedings of the 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing: Plenary, Special, Audio, Underwater Acoustics, VLSI, Neural Networks - Volume I
  • Year:
  • 1993

Abstract

Until recently, state-of-the-art large-vocabulary continuous speech recognition (CSR) has relied on hidden Markov models (HMMs) to model speech sounds. However, the limitations of HMMs in modeling dependencies across phonetic segments have been known for some time. Last year, we presented the concept of a "Segmental Neural Net" (SNN) for phonetic modeling in CSR and demonstrated that a feed-forward neural network, used within a hybrid SNN/HMM system, reduces the word error rate by 20% compared with the baseline HMM system. In this paper we describe two developments over the initial system. First, we present a novel way to generate fixed-length segment representations based on the Discrete Cosine Transform (DCT). Second, we demonstrate that an Elliptical Basis Function (EBF) network can be used in the same hybrid framework.
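
The abstract gives no implementation details, so the following is only an illustrative sketch of the two ideas it names, not the authors' method. It assumes that a segment is a T x D array of feature frames, that a fixed-length representation is obtained by keeping the first few DCT-II coefficients of each feature track along time, and that an EBF unit is a Gaussian with its own inverse covariance matrix (hence elliptical contours). The function names `dct_segment_features` and `ebf_activations`, the 1/T scaling, and all parameter choices are hypothetical.

```python
import numpy as np


def dct_segment_features(frames, num_coeffs):
    """Map a variable-length (T, D) segment to a fixed (num_coeffs * D,) vector.

    Assumption: each feature dimension is treated as a trajectory over time
    and summarized by its first `num_coeffs` DCT-II coefficients.
    """
    frames = np.asarray(frames, dtype=float)
    num_frames = frames.shape[0]
    t = np.arange(num_frames)
    k = np.arange(num_coeffs)
    # DCT-II basis, shape (num_coeffs, T): cos(pi * k * (t + 0.5) / T)
    basis = np.cos(np.pi * np.outer(k, t + 0.5) / num_frames)
    # Dividing by T keeps coefficients comparable across durations
    # (a normalization chosen for this sketch, not taken from the paper).
    coeffs = basis @ frames / num_frames
    return coeffs.reshape(-1)


def ebf_activations(x, centers, inv_covs):
    """Activations of M Gaussian units with per-unit inverse covariances.

    x:        (N,) input vector
    centers:  (M, N) unit centers
    inv_covs: (M, N, N) inverse covariance matrix of each unit
    """
    diff = x[None, :] - centers
    # Squared Mahalanobis distance of x from each center
    dist_sq = np.einsum("mi,mij,mj->m", diff, inv_covs, diff)
    return np.exp(-0.5 * dist_sq)


# Two segments of different duration map to vectors of the same size.
short_segment = np.random.default_rng(0).normal(size=(5, 3))
long_segment = np.random.default_rng(1).normal(size=(12, 3))
short_vec = dct_segment_features(short_segment, num_coeffs=3)
long_vec = dct_segment_features(long_segment, num_coeffs=3)

# Hypothetical EBF layer with two units over the 9-dimensional segment vector.
centers = np.stack([short_vec, long_vec])
inv_covs = np.stack([np.eye(9), 0.5 * np.eye(9)])
hidden = ebf_activations(short_vec, centers, inv_covs)
```

In a hybrid system of the kind the abstract describes, the fixed-length vector would be the network input for each hypothesized phonetic segment; how the network outputs are combined with HMM scores is not stated in the abstract and is not sketched here.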