Events play a prominent role in our lives, and many social media documents describe or relate to some event. Organizing social media documents by event is therefore a promising way to manage the ever-increasing amount of content in social media applications. The challenge is to automate this process so that incoming documents are assigned to their corresponding event without user intervention. We present a system that classifies a stream of social media data into a growing and evolving set of events. The system addresses two key problems that arise in this context: i) scaling to the data sizes and rates encountered in social media applications, and ii) new event detection, i.e. determining whether an incoming data item belongs to a new or a known event. For i), a candidate retrieval step retrieves a set of event candidates that the incoming data item is likely to belong to. For ii), a function trained using machine learning techniques decides whether the item belongs to the top-scoring candidate or to a new event. We show that our system handles both problems and that it outperforms other state-of-the-art approaches in terms of quality and scalability.
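The two-step pipeline described above (candidate retrieval, then a new-versus-known decision on the top-scoring candidate) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `StreamEventAssigner`, `Event`, and `assign` are hypothetical, documents are reduced to sets of tokens, candidate retrieval is approximated by an inverted token index, and a Jaccard similarity with a fixed threshold stands in for the function that the system learns with machine learning techniques.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional, Set


@dataclass
class Event:
    """Hypothetical event record: an id plus all tokens seen so far."""
    event_id: int
    tokens: Set[str] = field(default_factory=set)


class StreamEventAssigner:
    """Illustrative stand-in for the pipeline, not the authors' system."""

    def __init__(self, new_event_threshold: float = 0.3, max_candidates: int = 5):
        # Assumption: a fixed threshold replaces the trained decision function.
        self.threshold = new_event_threshold
        self.max_candidates = max_candidates
        self.events: Dict[int, Event] = {}
        self.index: Dict[str, Set[int]] = {}  # token -> ids of events containing it

    def _retrieve_candidates(self, tokens: Set[str]) -> List[int]:
        # Candidate retrieval: only events sharing at least one token are
        # considered, so the item is never compared against every event.
        shared: Dict[int, int] = {}
        for tok in tokens:
            for eid in self.index.get(tok, ()):
                shared[eid] = shared.get(eid, 0) + 1
        ranked = sorted(shared, key=lambda eid: (-shared[eid], eid))
        return ranked[: self.max_candidates]

    @staticmethod
    def _similarity(tokens: Set[str], event: Event) -> float:
        # Assumption: Jaccard overlap replaces the learned similarity.
        union = tokens | event.tokens
        return len(tokens & event.tokens) / len(union) if union else 0.0

    def assign(self, item_tokens: Iterable[str]) -> int:
        tokens = set(item_tokens)
        best_id: Optional[int] = None
        best_score = 0.0
        for eid in self._retrieve_candidates(tokens):
            score = self._similarity(tokens, self.events[eid])
            if score > best_score:
                best_id, best_score = eid, score

        # New event detection: open a new event if no candidate was found or
        # the top-scoring candidate is not similar enough.
        if best_id is None or best_score < self.threshold:
            best_id = len(self.events)
            self.events[best_id] = Event(best_id)

        # Update the chosen event and the index so the event set keeps evolving.
        self.events[best_id].tokens |= tokens
        for tok in tokens:
            self.index.setdefault(tok, set()).add(best_id)
        return best_id


assigner = StreamEventAssigner(new_event_threshold=0.3)
first = assigner.assign({"concert", "berlin", "2011"})
second = assigner.assign({"concert", "berlin", "stage"})
third = assigner.assign({"wedding", "paris"})
fourth = assigner.assign({"berlin", "marathon", "run", "street", "crowd"})
```

In this toy run the second item joins the event opened by the first, the third item finds no candidates and opens a new event, and the fourth item retrieves the first event as a candidate but falls below the threshold and therefore also opens a new event. In a real system the threshold decision would be replaced by a trained classifier and the token index by whatever retrieval backend scales to the stream.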