Robust Speaker Modeling Based on Constrained Nonnegative Tensor Factorization

  • Authors:
  • Qiang Wu;Liqing Zhang;Guangchuan Shi

  • Affiliations:
  • Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China 200240;Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China 200240;Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China 200240

  • Venue:
  • ISNN '08 Proceedings of the 5th international symposium on Neural Networks: Advances in Neural Networks
  • Year:
  • 2008

Abstract

Nonnegative tensor factorization is an extension of nonnegative matrix factorization (NMF) to the multilinear case, in which nonnegativity constraints are imposed on the PARAFAC/Tucker model. In this paper, to identify speakers in noisy environments, we propose a new method based on the PARAFAC model, called constrained Nonnegative Tensor Factorization (cNTF). The speech signal is encoded as a general higher-order tensor in order to learn basis functions from multiple interrelated feature subspaces. We simulate a cochlear-like peripheral auditory stage motivated by the human auditory perception mechanism. cNTF extracts a sparse speech feature representation, which is used for robust speaker modeling. Orthogonality and nonsmooth sparse control constraints are further imposed on the PARAFAC model in order to preserve the useful information of each feature subspace in the higher-order tensor. An alternating projection algorithm is applied to obtain a stable solution. Experimental results demonstrate that our method improves recognition accuracy, particularly in noisy environments.
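
To make the underlying model concrete, the following is a minimal sketch of an unconstrained nonnegative PARAFAC factorization of a 3-way tensor in Python with NumPy. It is not the cNTF method of the paper: the orthogonality and nonsmooth sparseness constraints are omitted, and standard multiplicative updates are swapped in for the alternating projection algorithm mentioned in the abstract. The function names (`khatri_rao`, `unfold`, `ntf_parafac`, `reconstruct`), the rank, the iteration count, and the synthetic tensor are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def khatri_rao(a, b):
    """Column-wise Kronecker product; row index is (row of a, row of b), a-major."""
    rank = a.shape[1]
    return (a[:, None, :] * b[None, :, :]).reshape(-1, rank)

def unfold(x, mode):
    """Mode-n unfolding: bring `mode` to the front, flatten the rest in C order."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)

def ntf_parafac(x, rank, n_iter=200, eps=1e-9, seed=0):
    """Nonnegative PARAFAC of a 3-way tensor via multiplicative updates.

    Assumption: plain least-squares objective with nonnegativity only;
    no orthogonality or sparseness terms are included in this sketch.
    """
    rng = np.random.default_rng(seed)
    # Strictly positive random initialization keeps every update well defined.
    factors = [rng.random((dim, rank)) + eps for dim in x.shape]
    for _ in range(n_iter):
        for mode in range(3):
            others = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(others[0], others[1])
            numer = unfold(x, mode) @ kr
            denom = factors[mode] @ (kr.T @ kr) + eps
            # Elementwise ratio of nonnegative terms preserves nonnegativity.
            factors[mode] *= numer / denom
    return factors

def reconstruct(factors):
    """Rebuild the tensor as a sum of rank-one outer products."""
    a, b, c = factors
    return np.einsum('ir,jr,kr->ijk', a, b, c)

def relative_error(x, factors):
    return np.linalg.norm(x - reconstruct(factors)) / np.linalg.norm(x)

# Synthetic nonnegative rank-2 tensor standing in for a speech feature tensor.
rng = np.random.default_rng(42)
true_factors = [rng.random((dim, 2)) for dim in (4, 5, 6)]
x = reconstruct(true_factors)

early = ntf_parafac(x, rank=2, n_iter=1)
late = ntf_parafac(x, rank=2, n_iter=200)
err_early = relative_error(x, early)
err_late = relative_error(x, late)
print(f"relative error after 1 sweep: {err_early:.4f}, after 200 sweeps: {err_late:.4f}")
```

In a speaker modeling setting, the three modes would correspond to whatever feature subspaces the tensor encoding defines, and one factor matrix would serve as the learned basis. Adding constraint terms to the update rule is where a constrained variant would depart from this sketch.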