Posterior estimates and transforms for speech recognition

  • Authors:
  • Jan Zelinka;Luboš Šmídl;Jan Trmal;Luděk Müller

  • Affiliations:
  • The Department of Cybernetics, University of West Bohemia, Czech Republic;The Department of Cybernetics, University of West Bohemia, Czech Republic;The Department of Cybernetics, University of West Bohemia, Czech Republic;The Department of Cybernetics, University of West Bohemia, Czech Republic

  • Venue:
  • TSD'10 Proceedings of the 13th international conference on Text, speech and dialogue
  • Year:
  • 2010

Abstract

This paper describes ANN-based posterior estimates and their application to speech recognition. We replaced standard back-propagation with the L-BFGS quasi-Newton method. We focused solely on posterior-based feature vector extraction, with the goal of reducing the feature vector dimension. To that end, we designed three posterior transforms that map to a space of dimensionality 1 or 2. The transforms were tested on the SpeechDat-East corpus, and the method was also applied to a Czech audio-visual corpus. In both cases, the methods lead to a significant decrease in word error rate.