Topic modeling is a key problem in document analysis. One of the canonical approaches to topic modeling is Probabilistic Latent Semantic Indexing (PLSI), which maximizes the joint probability of documents and terms in the corpus. The major disadvantage of PLSI is that it estimates each document's probability distribution over the hidden topics independently, and the number of parameters in the model grows linearly with the size of the corpus, which leads to serious overfitting. Latent Dirichlet Allocation (LDA) was proposed to overcome this problem by treating each document's probability distribution over topics as a hidden random variable. Both methods discover the hidden topics in Euclidean space. However, there is no convincing evidence that the document space is Euclidean, or flat. It is therefore more natural and reasonable to assume that the document space is a manifold, either linear or nonlinear. In this paper, we consider the problem of topic modeling on the intrinsic document manifold. Specifically, we propose a novel algorithm called Laplacian Probabilistic Latent Semantic Indexing (LapPLSI) for topic modeling. LapPLSI models the document space as a submanifold embedded in the ambient space and performs topic modeling directly on this document manifold. We compare the proposed LapPLSI approach with PLSI and LDA on three text data sets. Experimental results show that LapPLSI provides a better representation in the sense of semantic structure.
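To make the PLSI baseline and its parameter growth concrete, the following is a minimal sketch of plain PLSI fitted by EM, assuming a small dense document-term count matrix. It is not the LapPLSI algorithm and not the authors' implementation. The names `plsi_em` and `stored_parameters`, the toy corpus, and the random initialisation are all illustrative assumptions.

```python
import math
import random


def stored_parameters(n_docs, n_terms, n_topics):
    """Number of probabilities PLSI stores: P(z|d) for every document plus P(w|z).

    The first term grows linearly with the number of documents.
    """
    return n_docs * n_topics + n_topics * n_terms


def _normalize(row):
    """Scale a non-negative row to sum to 1 (uniform if the row is all zeros)."""
    total = sum(row)
    if total == 0.0:
        return [1.0 / len(row)] * len(row)
    return [x / total for x in row]


def plsi_em(counts, n_topics, n_iters=50, seed=0):
    """Fit PLSI by EM on counts[d][w] (hypothetical helper, illustration only).

    Returns (p_z_d, p_w_z, history), where p_z_d[d][k] = P(z_k | d),
    p_w_z[k][w] = P(w | z_k), and history[t] is the log-likelihood
    sum_{d,w} n(d,w) log sum_k P(w|z_k) P(z_k|d) before update t.
    (The joint likelihood differs only by a term that is constant in these parameters.)
    """
    rng = random.Random(seed)
    n_docs, n_terms = len(counts), len(counts[0])

    # Assumption: random positive initialisation, one independent row per document.
    p_z_d = [_normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [_normalize([rng.random() + 0.1 for _ in range(n_terms)])
             for _ in range(n_topics)]

    history = []
    for _ in range(n_iters):
        acc_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        acc_w_z = [[0.0] * n_terms for _ in range(n_topics)]
        log_lik = 0.0
        for d in range(n_docs):
            for w in range(n_terms):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z_k | d, w) is proportional to P(w | z_k) P(z_k | d).
                joint = [p_w_z[k][w] * p_z_d[d][k] for k in range(n_topics)]
                mix = sum(joint)
                log_lik += n * math.log(mix)
                for k in range(n_topics):
                    expected = n * joint[k] / mix
                    acc_z_d[d][k] += expected
                    acc_w_z[k][w] += expected
        history.append(log_lik)
        # M-step: renormalise the expected counts into distributions.
        p_z_d = [_normalize(row) for row in acc_z_d]
        p_w_z = [_normalize(row) for row in acc_w_z]
    return p_z_d, p_w_z, history


if __name__ == "__main__":
    # Toy corpus (made up): 4 documents over a 4-term vocabulary.
    toy = [[4, 3, 0, 0], [3, 5, 0, 1], [0, 0, 4, 4], [0, 1, 5, 3]]
    doc_topics, topic_terms, ll = plsi_em(toy, n_topics=2, n_iters=30, seed=1)
    print("log-likelihood: first", ll[0], "last", ll[-1])
    print("stored parameters:", stored_parameters(len(toy), len(toy[0]), 2))
```

Each document gets its own row of `p_z_d`, estimated without reference to any other document, so doubling the corpus doubles that part of the model. This is the independence and linear parameter growth that the abstract identifies as the source of overfitting.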