The Journal of Machine Learning Research
ICML '06 Proceedings of the 23rd international conference on Machine learning
Structured correspondence topic models for mining captioned figures in biological literature
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
An architecture for parallel topic models
Proceedings of the VLDB Endowment
Scalable clustering of news search results
Proceedings of the fourth ACM international conference on Web search and data mining
Unified analysis of streaming news
Proceedings of the 20th international conference on World wide web
Scalable distributed inference of dynamic user interests for behavioral targeting
Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Scalable inference in latent variable models
Proceedings of the fifth ACM international conference on Web search and data mining
Modeling content and users: structured probabilistic representation and scalable online inference algorithms
Online content has become an important medium for disseminating information and expressing opinions. With its proliferation, users risk missing the big picture in a sea of irrelevant and/or diverse content. In this paper, we address the problem of organizing online document collections and provide algorithms that create a structured representation of otherwise unstructured content. We leverage the expressiveness of latent probabilistic models (e.g., topic models) and non-parametric Bayesian techniques (e.g., Dirichlet processes), and give online and distributed inference algorithms that scale to terabyte datasets and adapt the inferred representation as new documents arrive. This paper is an extended abstract of the dissertation of Ahmed [2011], which received the 2012 ACM SIGKDD best doctoral dissertation award.