Web search generates voluminous data streams that record millions of users' interactions with the search engine. Latent topics in web search data have recently been found to be critical for a wide range of search engine applications, such as search personalization and search history warehousing. However, existing methods usually discover latent topics from web search data in an offline, retrospective fashion. They are therefore increasingly ineffective as web search data keep accumulating in the form of online streams. In this paper, we propose a novel probabilistic topic model, the Web Search Stream Model (WSSM), which is designed to handle two salient features of web search data: the data arrive as streams, and they arrive in massive volume. We further propose an efficient parameter inference method, Stream Parameter Inference (SPI), to train WSSM on massive web search streams. Using a large-scale search engine query log, we conduct extensive experiments to verify the effectiveness and efficiency of WSSM and SPI. We observe that WSSM together with SPI discovers latent topics from web search streams faster than state-of-the-art methods while retaining comparable topic modeling accuracy.
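To make the contrast between offline and stream-based topic discovery concrete, the following is a minimal sketch of single-pass, mini-batch topic estimation. It is not WSSM or SPI, whose details are not given in this abstract. It swaps in a generic technique, stepwise (online) EM for a mixture-of-unigrams model, and every name in it (`StreamingTopicSketch`, `update`, `responsibilities`, the step-size exponent 0.7, the smoothing value) is an assumption made for illustration. Queries are assumed to be lists of integer word ids in the range `[0, vocab_size)`.

```python
import math
import random
from collections import defaultdict


class StreamingTopicSketch:
    """Stepwise (online) EM for a mixture-of-unigrams topic model.

    Illustrative only: a generic streaming estimator, NOT the WSSM model
    or the SPI inference method described in the abstract.
    """

    def __init__(self, num_topics, vocab_size, smoothing=0.01, seed=0):
        rng = random.Random(seed)
        self.k = num_topics
        self.v = vocab_size
        self.smoothing = smoothing  # hypothetical smoothing constant
        # Running averages of expected counts; random init breaks symmetry.
        self.topic_stats = [1.0 / num_topics] * num_topics
        self.word_stats = [
            [rng.random() / vocab_size for _ in range(vocab_size)]
            for _ in range(num_topics)
        ]
        self.row_totals = [sum(row) for row in self.word_stats]
        self.batches_seen = 0

    def responsibilities(self, doc):
        """Return the posterior over topics for one query (list of word ids)."""
        topic_total = sum(self.topic_stats) + self.smoothing * self.k
        log_post = []
        for t in range(self.k):
            row = self.word_stats[t]
            denom = self.row_totals[t] + self.smoothing * self.v
            lp = math.log((self.topic_stats[t] + self.smoothing) / topic_total)
            for w in doc:
                lp += math.log((row[w] + self.smoothing) / denom)
            log_post.append(lp)
        peak = max(log_post)  # subtract the max for numerical stability
        weights = [math.exp(lp - peak) for lp in log_post]
        norm = sum(weights)
        return [x / norm for x in weights]

    def update(self, batch):
        """Fold one mini-batch of queries into the running statistics."""
        if not batch:
            return
        self.batches_seen += 1
        # Decaying step size; the exponent 0.7 is an assumed choice.
        rho = (self.batches_seen + 1) ** -0.7
        batch_topic = [0.0] * self.k
        batch_word = [defaultdict(float) for _ in range(self.k)]
        for doc in batch:  # E-step on this batch only
            resp = self.responsibilities(doc)
            for t in range(self.k):
                batch_topic[t] += resp[t]
                for w in doc:
                    batch_word[t][w] += resp[t]
        n = len(batch)
        for t in range(self.k):  # blend old statistics with the batch
            self.topic_stats[t] = (
                (1.0 - rho) * self.topic_stats[t] + rho * batch_topic[t] / n
            )
            row = self.word_stats[t]
            for w in range(self.v):
                row[w] *= 1.0 - rho
            for w, count in batch_word[t].items():
                row[w] += rho * count / n
            self.row_totals[t] = sum(row)


# Each mini-batch is seen once and then discarded, as in a stream.
model = StreamingTopicSketch(num_topics=2, vocab_size=4)
for batch in ([[0, 1], [0, 0, 1]], [[2, 3], [3, 3, 2]], [[0, 1, 1], [2, 2, 3]]):
    model.update(batch)
```

The design point the sketch is meant to show is that memory and per-batch cost depend on the number of topics and the vocabulary size, not on how many queries have already passed through the stream. An offline method would instead revisit the full log on every iteration.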