On-line learning and stochastic approximations
On-line learning in neural networks
On the algorithmic implementation of multiclass kernel-based vector machines
The Journal of Machine Learning Research
The Journal of Machine Learning Research
The author-topic model for authors and documents
UAI '04 Proceedings of the 20th conference on Uncertainty in artificial intelligence
A Bayesian Hierarchical Model for Learning Natural Scene Categories
CVPR '05 Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) - Volume 2 - Volume 02
Discovering Objects and their Localization in Images
ICCV '05 Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1 - Volume 01
Learning Object Categories from Google"s Image Search
ICCV '05 Proceedings of the Tenth IEEE International Conference on Computer Vision - Volume 2
ICML '06 Proceedings of the 23rd international conference on Machine learning
Using Multiple Segmentations to Discover Objects and their Extent in Image Collections
CVPR '06 Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2
Pegasos: Primal Estimated sub-GrAdient SOlver for SVM
Proceedings of the 24th international conference on Machine learning
Efficient projections onto the l1-ball for learning in high dimensions
Proceedings of the 25th international conference on Machine learning
Mixed Membership Stochastic Blockmodels
The Journal of Machine Learning Research
MedLDA: maximum margin supervised topic models for regression and classification
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Efficient methods for topic model inference on streaming document collections
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Exponential family sparse coding with applications to self-taught learning
IJCAI'09 Proceedings of the 21st international jont conference on Artifical intelligence
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2
On smoothing and inference for topic models
UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence
Online multiscale dynamic topic models
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
Regularized latent semantic indexing
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Probabilistic latent semantic analysis
UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Mr. LDA: a flexible large scale topic modeling package using variational inference in MapReduce
Proceedings of the 21st international conference on World Wide Web
Towards compact topical descriptors
CVPR '12 Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Learning Semantic Motion Patterns for Dynamic Scenes by Improved Sparse Topical Coding
ICME '12 Proceedings of the 2012 IEEE International Conference on Multimedia and Expo
Multi-Level structured image coding on high-dimensional image representation
ACCV'12 Proceedings of the 11th Asian conference on Computer Vision - Volume Part II
Hi-index | 0.00 |
Topic models have shown great promise in discovering latent semantic structures from complex data corpora, ranging from text documents and web news articles to images, videos, and even biological data. In order to deal with massive data collections and dynamic text streams, probabilistic online topic models such as online latent Dirichlet allocation (OLDA) have recently been developed. However, due to normalization constraints, OLDA can be ineffective in controlling the sparsity of discovered representations, a desirable property for learning interpretable semantic patterns, especially when the total number of topics is large. In contrast, sparse topical coding (STC) has been successfully introduced as a non-probabilistic topic model for effectively discovering sparse latent patterns by using sparsity-inducing regularization. But, unfortunately STC cannot scale to very large datasets or deal with online text streams, partly due to its batch learning procedure. In this paper, we present a sparse online topic model, which directly controls the sparsity of latent semantic patterns by imposing sparsity-inducing regularization and learns the topical dictionary by an online algorithm. The online algorithm is efficient and guaranteed to converge. Extensive empirical results of the sparse online topic model as well as its collapsed and supervised extensions on a large-scale Wikipedia dataset and the medium-sized 20Newsgroups dataset demonstrate appealing performance.