RPLSA: A novel updating scheme for Probabilistic Latent Semantic Analysis

  • Authors:
  • N. Bassiou; C. Kotropoulos

  • Venue:
  • Computer Speech and Language
  • Year:
  • 2011

Abstract

A novel updating method for Probabilistic Latent Semantic Analysis (PLSA), called Recursive PLSA (RPLSA), is proposed. The updates of the conditional probabilities are derived from first principles for both the asymmetric and the symmetric PLSA formulations. For both formulations, the performance of RPLSA is compared with that of PLSA folding-in, PLSA rerun from the breakpoint, and well-known LSA updating methods, such as singular value decomposition (SVD) folding-in and SVD-updating. The experimental results demonstrate that RPLSA outperforms the other updating methods under study on two criteria: maximization of the average log-likelihood, and minimization of the average absolute error between the probabilities estimated by the updating methods and those derived by applying non-adaptive PLSA from scratch. A comparison in terms of CPU run time is conducted as well. Finally, in document clustering evaluated with the Adjusted Rand index, it is demonstrated that the clusters generated by RPLSA are (a) similar to those generated by PLSA applied from scratch and (b) closer to the ground truth than those created by the other PLSA or LSA updating methods.
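
The abstract evaluates clustering with the Adjusted Rand index but does not restate the measure. As a reading aid, the following is a minimal Python sketch of the standard pair-counting definition of that index. It is not taken from the paper: the function name `adjusted_rand_index` and its arguments are hypothetical, and returning 1.0 in the degenerate case is an assumption of this sketch.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two flat clusterings of the same items.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)

    # Contingency counts: items shared by (true cluster i, predicted cluster j).
    cell_sizes = Counter(zip(labels_true, labels_pred))
    true_sizes = Counter(labels_true)
    pred_sizes = Counter(labels_pred)

    # Number of item pairs that fall together in each cell, row, and column.
    sum_cells = sum(comb(c, 2) for c in cell_sizes.values())
    sum_true = sum(comb(c, 2) for c in true_sizes.values())
    sum_pred = sum(comb(c, 2) for c in pred_sizes.values())

    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # assumption: fewer than two items counts as perfect agreement

    expected = sum_true * sum_pred / total_pairs  # agreement expected by chance
    max_index = 0.5 * (sum_true + sum_pred)
    if max_index == expected:
        # Assumption: degenerate partitions (e.g. one cluster each) count as identical.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


if __name__ == "__main__":
    ground_truth = [0, 0, 1, 1]
    clustering = [1, 1, 0, 0]  # same partition, different label names
    print(adjusted_rand_index(ground_truth, clustering))
```

The index depends only on which items are grouped together, not on the label names. That is what allows clusters from an updating method to be compared both with the ground truth and with clusters from a model trained from scratch.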