Enhancing spontaneous speech recognition with BLSTM features

  • Authors: Martin Wöllmer, Björn Schuller

  • Affiliation (both authors): Institute for Human-Machine Communication, Technische Universität München, München, Germany

  • Venue: NOLISP'11 Proceedings of the 5th international conference on Advances in nonlinear speech processing
  • Year: 2011

Abstract

This paper introduces a novel context-sensitive feature extraction approach for spontaneous speech recognition. Because bidirectional Long Short-Term Memory (BLSTM) networks are known to improve phoneme recognition accuracy by incorporating long-range contextual information into speech decoding, we integrate the BLSTM principle into a Tandem front-end for probabilistic feature extraction. Unlike previously proposed approaches, which exploit BLSTM modeling by generating a discrete phoneme prediction feature, our feature extractor merges continuous high-level probabilistic BLSTM features with low-level features. Evaluations on challenging spontaneous, conversational speech recognition tasks show that this concept outperforms recently published architectures for feature-level context modeling.
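The feature merging described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the low-level feature type, the number of phoneme classes, or any post-processing. The function name `tandem_features`, the 39-dimensional low-level features, the 41 phoneme classes, and the use of log-posteriors are all assumptions made for illustration, and random softmax outputs stand in for the output of a trained BLSTM.

```python
import numpy as np


def tandem_features(low_level, posteriors, eps=1e-10):
    """Append per-frame log-posteriors to low-level features.

    Hypothetical helper, not taken from the paper.
    low_level:  array of shape (frames, n_low), e.g. cepstral features
    posteriors: array of shape (frames, n_phonemes), rows summing to 1
    """
    low_level = np.asarray(low_level, dtype=float)
    posteriors = np.asarray(posteriors, dtype=float)
    if low_level.shape[0] != posteriors.shape[0]:
        raise ValueError("both inputs must cover the same number of frames")
    # Assumption: take the log of the posteriors, clipping to avoid log(0).
    log_post = np.log(np.clip(posteriors, eps, 1.0))
    # Frame-wise concatenation: each frame keeps its low-level features
    # and gains one continuous value per phoneme class.
    return np.concatenate([low_level, log_post], axis=1)


# Toy data standing in for real inputs (all sizes are assumptions).
rng = np.random.default_rng(0)
mfcc = rng.normal(size=(5, 39))            # 5 frames of low-level features
logits = rng.normal(size=(5, 41))          # stand-in for BLSTM output activations
post = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

feats = tandem_features(mfcc, post)
print(feats.shape)
```

The point of the sketch is the contrast drawn in the abstract: a discrete phoneme prediction would contribute one symbol per frame, whereas here every frame carries the full continuous probability vector alongside the low-level features.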