Burst detection is a fundamental problem for data streams, and it has attracted extensive attention in the research community because of its broad applications. In this paper, we introduce an integrated approach to detecting burst events over high-speed short text streams. First, we simplify the problem by treating a burst event as a set of burst features, which speeds up processing and allows multiple features to be identified simultaneously. Second, we measure each feature by the ratio of the number of documents containing it to the total number of documents in a given period; because this measure is normalized by the total document volume, our solution adapts to any data distribution. Third, we propose two algorithms that maintain this ratio over a sliding window. Finally, we propose a burst detection algorithm based on two data structures, the Ratio Aggregation Pyramid (RAP) and the Slope Pyramid (SP), both extended from the Aggregation Pyramid (AP). Our algorithm is parameter-free and detects bursts at multiple window sizes simultaneously. Theoretical analysis and experimental results verify the feasibility, efficiency, and scalability of our method.
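To make the ratio measure and the multi-window idea concrete, the following is a minimal Python sketch. It is not the paper's RAP or SP algorithm: the class `SlidingRatio`, the function `ratio_pyramid`, the count-based window, and the prefix-sum layout are all assumptions made for illustration. The first part keeps, for each feature, the fraction of the most recent documents that contain it; the second computes that ratio for every contiguous window of time slots, in the spirit of an aggregation pyramid.

```python
from collections import Counter, deque


class SlidingRatio:
    """Hypothetical helper: per-feature document ratio over the last `window` documents."""

    def __init__(self, window):
        self.window = window
        self.docs = deque()      # feature sets of the documents currently in the window
        self.counts = Counter()  # number of in-window documents containing each feature

    def add(self, features):
        feats = frozenset(features)  # count each feature at most once per document
        self.docs.append(feats)
        self.counts.update(feats)
        if len(self.docs) > self.window:
            expired = self.docs.popleft()
            self.counts.subtract(expired)
            for f in expired:
                if self.counts[f] == 0:
                    del self.counts[f]  # drop features that left the window entirely

    def ratio(self, feature):
        if not self.docs:
            return 0.0
        return self.counts[feature] / len(self.docs)


def ratio_pyramid(hits, totals):
    """Return pyramid[h][t]: ratio of feature documents to all documents
    over time slots t..t+h (inclusive), for every window size h+1.

    hits[i] is the number of documents with the feature in slot i,
    totals[i] the number of all documents in slot i (assumed inputs).
    """
    n = len(hits)
    prefix_hits, prefix_totals = [0], [0]
    for h, t in zip(hits, totals):
        prefix_hits.append(prefix_hits[-1] + h)
        prefix_totals.append(prefix_totals[-1] + t)

    pyramid = []
    for h in range(n):
        level = []
        for t in range(n - h):
            total = prefix_totals[t + h + 1] - prefix_totals[t]
            hit = prefix_hits[t + h + 1] - prefix_hits[t]
            level.append(hit / total if total else 0.0)
        pyramid.append(level)
    return pyramid
```

In this sketch a burst candidate would be any cell whose ratio is unusually high for its level. How such cells are ranked without user-supplied thresholds is where the paper's method comes in, and it is not reproduced here.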