Unsupervised learning by probabilistic latent semantic analysis
Machine Learning
Probabilistic latent semantic analysis
UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Hi-index | 0.00 |
We describe in this paper, a system that groups, classifies and finds the latent semantic features in a database composed of a large number of documents. The database will be constantly growing as users who co-create it will be adding more and more new documents. Users require a system to provide them information, both about a specific document, and about the entire set of documents. This information includes statistical data about words in documents, information about aspects in which this words appears, classification, clustering, etc. To meet these expectations we propose using methods for searching for hidden patterns in multivariable data. We apply machine learning algorithms for data analysis, useful in identifying local patterns in multivariate data. We consider two different algorithms described in the literature (1) Probabilistic Latent Semantic Analysis Method [2] and (2) Nonnegative Matrix Factorization algorithm described in [4] and used in the text analysis system [1].