With the rapid growth of data collections, statistical document modeling has become increasingly important in language processing. Probabilistic latent semantic analysis (PLSA) is a popular approach that effectively captures the semantics and statistics of documents for modeling. However, PLSA is highly sensitive to the task domain, which changes continuously in real-world documents. In this paper, a novel Bayesian PLSA framework is presented. We focus on an incremental learning algorithm that solves the problem of updating the model with articles from new domains. The algorithm improves document modeling by incrementally extracting up-to-date latent semantic information to match the changing domains at run time. Because the priors of the PLSA parameters are represented by Dirichlet densities, the posterior densities belong to the same distribution family, which yields a reproducible prior/posterior mechanism for incremental learning from continuously accumulating documents. An incremental PLSA algorithm is constructed to perform both parameter estimation and hyperparameter updating. In contrast to standard PLSA with maximum likelihood estimation, the proposed approach is capable of dynamic document indexing and modeling. We also present maximum a posteriori PLSA for corrective training. Experiments on information retrieval and document categorization demonstrate the superiority of the Bayesian PLSA methods.
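The reproducible prior/posterior mechanism rests on Dirichlet-multinomial conjugacy, which can be illustrated with a minimal sketch. This is not the paper's algorithm: it updates a single multinomial (for example, one topic's word distribution) from observed counts, whereas in PLSA the counts would be expected counts produced by an E-step. The function names `update_dirichlet` and `map_estimate`, the initial hyperparameters, and the toy batches are all assumptions made for illustration.

```python
# Illustrative sketch only: conjugate Dirichlet updating for ONE multinomial.
# All names and numbers here are hypothetical, not taken from the paper.

def update_dirichlet(alpha, counts):
    """Dirichlet(alpha) prior + multinomial counts -> Dirichlet(alpha + counts)."""
    if len(alpha) != len(counts):
        raise ValueError("alpha and counts must have the same length")
    return [a + c for a, c in zip(alpha, counts)]

def map_estimate(alpha):
    """Mode of Dirichlet(alpha); defined here only when every alpha_i > 1."""
    if any(a <= 1 for a in alpha):
        raise ValueError("MAP estimate requires all hyperparameters > 1")
    total = sum(a - 1 for a in alpha)
    return [(a - 1) / total for a in alpha]

# Assumed starting hyperparameters over a 3-word vocabulary.
alpha = [2.0, 2.0, 2.0]

# Two batches of word counts arriving over time (toy data).
batches = [[5, 1, 0], [0, 2, 6]]

for counts in batches:
    # The posterior after this batch becomes the prior for the next one.
    alpha = update_dirichlet(alpha, counts)
    theta = map_estimate(alpha)
    print(alpha, theta)
```

Each batch is absorbed into the hyperparameters and can then be discarded, so the estimate tracks newly arriving data without revisiting earlier documents.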