As massive repositories of real-time human commentary, social media platforms have arguably evolved far beyond passive facilitation of online social interactions. Rapid analysis of the information content in online social media streams (news articles, blogs, tweets, etc.) is increasingly important because it allows businesses and government bodies to understand public opinion about products and policies. In most of these settings, data points appear as a stream of high-dimensional feature vectors. Guided by real-world industrial deployment scenarios, we revisit the problem of online learning of topics from streaming social media content. On the one hand, the topics need to be dynamically adapted to the statistics of incoming data points; on the other hand, early detection of rising new trends is important in many applications. We propose an online nonnegative matrix factorization framework that captures the evolution and emergence of themes in unstructured text under a novel form of temporal regularization. We develop scalable optimization algorithms for our framework, propose a new set of evaluation metrics, and report promising empirical results on traditional topic detection and tracking (TDT) tasks as well as streaming Twitter data. Our system is able to rapidly capture emerging themes and track existing topics over time while maintaining temporal consistency and continuity in user views, and it can be explicitly configured to bound the amount of information presented to the user.
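To make the idea of temporally regularized online factorization concrete, the following is a minimal sketch and not the algorithm developed in the paper. It assumes a simple objective, ||X - WH||_F^2 + mu * ||H - H_prev||_F^2, where X is the document-by-term matrix for the current time window, H holds the topics, and H_prev holds the topics from the previous window. The function names, the Frobenius penalty, the multiplicative updates, and the parameter mu are all illustrative assumptions; the handling of newly emerging topics is omitted.

```python
import numpy as np


def objective(X, W, H, H_prev, mu):
    """Reconstruction error plus the assumed temporal penalty (Frobenius norms)."""
    return np.linalg.norm(X - W @ H) ** 2 + mu * np.linalg.norm(H - H_prev) ** 2


def temporal_nmf_step(X, H_prev, mu=1.0, n_iter=100, eps=1e-9, seed=0):
    """Factorize one window X (docs x terms) as W @ H with W, H >= 0.

    H (topics x terms) is pulled toward H_prev with strength mu.
    This is a hypothetical helper, not the paper's optimizer.
    """
    rng = np.random.default_rng(seed)
    n_docs, k = X.shape[0], H_prev.shape[0]
    W = rng.random((n_docs, k)) + eps   # random positive document-topic weights
    H = H_prev.copy()                   # warm-start topics from the previous window
    for _ in range(n_iter):
        # Standard multiplicative update for W with H held fixed.
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        # Multiplicative update for H; the mu terms come from the temporal penalty.
        H *= (W.T @ X + mu * H_prev) / (W.T @ W @ H + mu * H + eps)
    return W, H


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    H_prev = rng.random((4, 30))   # 4 topics over a 30-term vocabulary
    X = rng.random((20, 30))       # 20 documents in the current window
    W, H = temporal_nmf_step(X, H_prev, mu=1.0)
    print("objective:", objective(X, W, H, H_prev, 1.0))
```

Under these assumptions, a larger mu keeps the topics closer to those of the previous window (more continuity in what the user sees), while a smaller mu lets them adapt faster to the statistics of incoming documents.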