Sparse Multilayer Perceptron for Phoneme Recognition

  • Authors:
  • G. S.V.S. Sivaram;H. Hermansky

  • Affiliations:
  • ECE Dept., Johns Hopkins Univ., Baltimore, MD, USA;-

  • Venue:
  • IEEE Transactions on Audio, Speech, and Language Processing
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper introduces the sparse multilayer perceptron (SMLP) which jointly learns a sparse feature representation and nonlinear classifier boundaries to optimally discriminate multiple output classes. SMLP learns the transformation from the inputs to the targets as in multilayer perceptron (MLP) while the outputs of one of the internal hidden layers is forced to be sparse. This is achieved by adding a sparse regularization term to the cross-entropy cost and updating the parameters of the network to minimize the joint cost. On the TIMIT phoneme recognition task, SMLP-based systems trained on individual speech recognition feature streams perform significantly better than the corresponding MLP-based systems. Phoneme error rate of 19.6% is achieved using the combination of SMLP-based systems, a relative improvement of 3.0% over the combination of MLP-based systems.