The problem of identifying important online or real-life events from the large textual document streams freely available on the World Wide Web is attracting growing attention, given the flourishing of the social web. An event triggers discussion and comments on the WWW, especially in the blogosphere and in microblogging services. Consequently, by analyzing information publicly available on the web, one should be able to identify the entities, topics, time, and location involved in an event, create semantically rich representations of events, and then use this information to provide interesting results or summarize news for users. In this paper, we define the concept of an important event and propose an efficient methodology for performing event detection on large time-stamped web document streams. The methodology integrates named entity recognition, dynamic topic map discovery, topic clustering, and peak detection techniques. In addition, we propose an efficient algorithm for detecting all important events in a document stream. We perform an extensive evaluation of the proposed methodology and algorithm on a dataset of 7 million blog posts, as well as through an international social event detection challenge. The results provide evidence that our approach: a) accurately detects important events, b) creates semantically rich representations of the detected events, c) can be adequately parameterized to correspond to different social perceptions of the event concept, and d) is suitable for online event detection on very large datasets. The expected complexity of the online facet of the proposed algorithm is linear with respect to the number of documents in the data stream.
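To make the peak detection step more concrete, the following is a minimal sketch of one way to flag bursts in a time-stamped stream in a single pass. It is not the algorithm proposed in the paper: the function name `detect_peaks`, the parameters `threshold` and `warmup`, and the running mean and standard deviation rule are all assumptions chosen for illustration. The input is assumed to be a sequence of per-interval mention counts for one entity or topic.

```python
from math import sqrt


def detect_peaks(counts, threshold=3.0, warmup=3):
    """Return indices of intervals whose count is unusually high.

    Hypothetical rule (an assumption, not the paper's method): an interval
    is a peak if its count exceeds the running mean of all earlier
    intervals by more than `threshold` standard deviations. The standard
    deviation is floored at 1.0 so that a flat history does not make
    every small fluctuation look like a burst.
    """
    peaks = []
    n = 0        # number of intervals seen so far
    mean = 0.0   # running mean of the counts seen so far
    m2 = 0.0     # running sum of squared deviations (Welford's method)

    for i, x in enumerate(counts):
        # Only test once enough history has accumulated.
        if n >= warmup:
            std = sqrt(m2 / n)
            if x > mean + threshold * max(std, 1.0):
                peaks.append(i)

        # Welford's online update: constant work per interval, so the
        # whole pass is linear in the length of the stream.
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)

    return peaks


# Hypothetical daily mention counts for one entity.
daily_mentions = [5, 6, 5, 4, 6, 40, 5, 6]
print(detect_peaks(daily_mentions))
```

Each interval is processed once with constant work, which matches the spirit of the linear online complexity stated above, although the actual algorithm and its parameters may differ substantially from this sketch.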