On the equivalence between nonnegative tensor factorization and tensorial probabilistic latent semantic analysis

  • Authors:
  • Wei Peng; Tao Li

  • Affiliations:
  • Xerox Innovation Group, Xerox Corporation, Rochester, USA 14580; School of Computer Science, Florida International University, Miami, USA 33199

  • Venue:
  • Applied Intelligence
  • Year:
  • 2011

Abstract

Non-negative Matrix Factorization (NMF) and Probabilistic Latent Semantic Analysis (PLSA) are two widely used methods for decomposing non-negative two-way data (e.g., document-term matrices). Studies have shown that PLSA and NMF (with the Kullback-Leibler divergence objective) are different algorithms optimizing the same objective function. Recently, analyzing multi-way data (i.e., tensors) has attracted considerable attention, as multi-way data have rich intrinsic structure and naturally appear in many real-world applications. In this paper, we study the relationships between the multi-way extensions of NMF and PLSA, namely NTF (Non-negative Tensor Factorization) and T-PLSA (Tensorial Probabilistic Latent Semantic Analysis). Two types of T-PLSA models are shown to be equivalent to two well-known non-negative tensor factorization models: PARAFAC and Tucker3 (with the KL-divergence objective). NTF and T-PLSA are also compared empirically in terms of objective functions, decomposition results, clustering quality, and computational complexity on both synthetic and real-world datasets. Finally, we show that a hybrid method that runs NTF and T-PLSA alternately can escape each other's local minima and thus achieve better clustering performance.
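The two-way equivalence the abstract builds on can be sketched concretely. The following NumPy snippet (a minimal illustration, not the authors' code; all variable names are hypothetical) runs the standard Lee-Seung multiplicative updates for NMF under the KL-divergence objective, then normalizes the factors to recover PLSA-style probabilities P(z), P(w|z), and P(d|z), which is the sense in which the two methods optimize the same objective.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((20, 30)) + 0.1   # strictly positive "document-term" matrix
k = 4                            # number of latent factors / topics
W = rng.random((20, k)) + 0.1
H = rng.random((k, 30)) + 0.1

def kl_div(X, WH):
    """Generalized KL divergence D(X || WH), the NMF objective."""
    return np.sum(X * np.log(X / WH) - X + WH)

before = kl_div(X, W @ H)
for _ in range(100):
    # Lee-Seung multiplicative updates; each step is non-increasing in KL.
    WH = W @ H
    W *= ((X / WH) @ H.T) / H.sum(axis=1)
    WH = W @ H
    H *= (W.T @ (X / WH)) / W.sum(axis=0)[:, None]
after = kl_div(X, W @ H)

# PLSA correspondence: with s = column sums of W and t = row sums of H,
# the joint model P(w, d) = sum_z P(z) P(w|z) P(d|z) is recovered as:
s, t = W.sum(axis=0), H.sum(axis=1)
Pz = s * t / (s * t).sum()   # P(z)
Pw_z = W / s                 # P(w|z): columns sum to 1
Pd_z = H / t[:, None]        # P(d|z): rows sum to 1
```

The tensor case studied in the paper generalizes this pattern: PARAFAC and Tucker3 under KL divergence play the role of NMF, and the two T-PLSA variants play the role of PLSA.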