The Markov selection model for concurrent speech recognition

  • Authors:
  • Paris Smaragdis; Bhiksha Raj

  • Affiliations:
  • University of Illinois, Urbana-Champaign, IL, USA and Adobe Systems Inc., USA; Carnegie Mellon University, Pittsburgh, PA, USA

  • Venue:
  • Neurocomputing
  • Year:
  • 2012

Abstract

In this paper we introduce a new Markov model capable of recognizing speech from recordings of simultaneously speaking, a priori known speakers. It builds on recent work on non-negative representations of spectrograms, which have been shown to be very effective in source separation problems. We extend these approaches to design a Markov selection model that can recognize sequences even when they are presented mixed together, without the need to perform separation on the signals. Factorial Markov models, which have been used for similar tasks in the past, have state spaces that are exponential in the number of sources; in contrast, this approach yields a model of low computational complexity whose state space is linear in the number of sources. We demonstrate the use of this framework in recognizing speech from mixtures of known speakers.
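
The state-space comparison in the abstract can be made concrete with a small counting sketch. It assumes, purely for illustration, that each source is modeled by a chain of K states: a factorial model that enumerates every joint combination then needs K^N states for N sources, while a model whose state space grows linearly is taken here to need N * K. The function names, the value of K, and the N * K form are assumptions for this example and are not taken from the paper.

```python
def factorial_state_count(states_per_source: int, num_sources: int) -> int:
    """Joint states when every combination of per-source states is enumerated."""
    return states_per_source ** num_sources


def linear_state_count(states_per_source: int, num_sources: int) -> int:
    """States when the total grows linearly (assumed: K states per source, N sources)."""
    return states_per_source * num_sources


if __name__ == "__main__":
    K = 40  # hypothetical number of states per speaker model
    for n in (1, 2, 3, 4):
        # Print: number of sources, exponential count, linear count
        print(n, factorial_state_count(K, n), linear_state_count(K, n))
```

Under these assumptions, the two counts coincide for a single source and diverge quickly as speakers are added, which is the practical motivation the abstract gives for avoiding a factorial formulation.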