An on-line NMF model for temporal pattern learning: theory with application to automatic speech recognition

  • Authors: Hugo Van Hamme
  • Affiliations: Department ESAT, University of Leuven, Leuven, Belgium
  • Venue: LVA/ICA'12 Proceedings of the 10th international conference on Latent Variable Analysis and Signal Separation
  • Year: 2012

Abstract

Convolutional non-negative matrix factorization (CNMF) can be used to discover recurring temporal patterns in sequences of non-negative vectors such as spectrograms or posteriorgrams. This approach has two drawbacks: the patterns are rigid, and it is intrinsically a batch method. In speech processing, however, as in many other applications, the patterns show a great deal of time-warping variation, and recognition should be on-line (possibly with some processing delay). Therefore, time-coded NMF (TC-NMF) is proposed as an alternative to CNMF for locating patterns in time. TC-NMF is motivated by findings in neuroscience. The sequential data are first processed by a bank of filters, such as leaky integrators with different time constants. The responses of these filters are then modeled jointly by a constrained NMF. Algorithms for learning, decoding and locating patterns in time are proposed and verified with preliminary ASR experiments.
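
The two-stage idea in the abstract (a bank of leaky integrators followed by a joint NMF of their responses) can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not specify the filter parameterization or the NMF constraint, so the sketch assumes a first-order leaky integrator with decay `exp(-1/tau)` per frame and substitutes plain, unconstrained NMF with standard multiplicative updates for the Frobenius cost. The function names `leaky_integrator_bank` and `nmf_multiplicative`, the time constants, and the toy data are all hypothetical.

```python
import numpy as np


def leaky_integrator_bank(X, time_constants):
    """Filter each row of X (n_features x n_frames) with several leaky integrators.

    Assumed recursion: y[t] = a * y[t-1] + (1 - a) * x[t], with a = exp(-1 / tau).
    The responses for all time constants are stacked along the feature axis.
    """
    X = np.asarray(X, dtype=float)
    responses = []
    for tau in time_constants:
        a = np.exp(-1.0 / tau)  # per-frame decay factor (assumption, not from the paper)
        Y = np.zeros_like(X)
        state = np.zeros(X.shape[0])
        for t in range(X.shape[1]):
            state = a * state + (1.0 - a) * X[:, t]
            Y[:, t] = state
        responses.append(Y)
    return np.vstack(responses)


def nmf_multiplicative(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Plain NMF, V ~ W @ H, via multiplicative updates for the Frobenius cost.

    The paper uses a *constrained* NMF; no constraint is applied here.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy non-negative "spectrogram": 4 features, 50 frames (random placeholder data).
X = np.random.default_rng(1).random((4, 50))

# Three time constants (in frames) give a 12 x 50 stacked response matrix.
V = leaky_integrator_bank(X, time_constants=[2.0, 8.0, 32.0])

# Factorize the joint filter responses into 3 components.
W, H = nmf_multiplicative(V, rank=3)
```

Because each filter output is a convex combination of non-negative inputs, the stacked matrix `V` stays non-negative and is therefore a valid NMF input; slower integrators (larger `tau`) retain information about events further in the past, which is what lets a static factorization carry timing information.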