Online Burst Detection Over High Speed Short Text Streams

  • Authors:
  • Zhijian Yuan;Yan Jia;Shuqiang Yang

  • Affiliations:
  • Computer School, National University of Defense Technology, Changsha, China;Computer School, National University of Defense Technology, Changsha, China;Computer School, National University of Defense Technology, Changsha, China

  • Venue:
  • ICCS '07: Proceedings of the 7th International Conference on Computational Science, Part III
  • Year:
  • 2007

Abstract

Burst detection is an inherent problem for data streams, and it has attracted extensive attention in the research community because of its broad applications. In this paper, we introduce an integrated approach to the burst event detection problem over high-speed short text streams. First, we simplify the problem by treating a burst event as a set of burst features, which accelerates processing and allows multiple features to be identified simultaneously. Second, we use as the measurement the ratio of the number of documents containing a specific feature to the total number of documents in a given period, so that our solution adapts to any kind of data distribution. Third, we propose two algorithms to maintain this ratio over a sliding window. Finally, we propose a burst detection algorithm based on the Ratio Aggregation Pyramid (RAP) and Slope Pyramid (SP) data structures, which are extended from the Aggregation Pyramid (AP). Our algorithm detects bursts at multiple window sizes simultaneously and is parameter-free. Theoretical analysis and experimental results verify the effectiveness, efficiency, and scalability of our method.
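
The ratio measurement described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's RAP or SP algorithm, whose details the abstract does not give. It assumes the stream has already been bucketed into time slots, each summarized as a pair (documents containing the feature, total documents). The names `SlidingRatio` and `ratio_pyramid` are hypothetical, and the pyramid layout (level h holds the ratios of all windows spanning h + 1 consecutive slots) is an assumption made for illustration.

```python
from collections import deque


class SlidingRatio:
    """Ratio of feature-bearing documents over the last `window` time slots.

    Hypothetical helper: running sums are updated incrementally, so each
    push costs O(1) regardless of the window length.
    """

    def __init__(self, window):
        self.window = window
        self.slots = deque()   # (docs_with_feature, total_docs) per slot
        self.with_feature = 0  # running sum of docs_with_feature in window
        self.total = 0         # running sum of total_docs in window

    def push(self, docs_with_feature, total_docs):
        # Add the newest slot, then evict the oldest one if the window is full.
        self.slots.append((docs_with_feature, total_docs))
        self.with_feature += docs_with_feature
        self.total += total_docs
        if len(self.slots) > self.window:
            old_feature, old_total = self.slots.popleft()
            self.with_feature -= old_feature
            self.total -= old_total

    def ratio(self):
        # An empty window is reported as 0.0 (an assumption of this sketch).
        return self.with_feature / self.total if self.total else 0.0


def ratio_pyramid(slots):
    """Ratios for every window size at once (assumed layout, see lead-in).

    pyramid[h][i] is the ratio over slots[i : i + h + 1], computed from
    prefix sums so each cell costs O(1).
    """
    prefix_feature = [0]
    prefix_total = [0]
    for docs_with_feature, total_docs in slots:
        prefix_feature.append(prefix_feature[-1] + docs_with_feature)
        prefix_total.append(prefix_total[-1] + total_docs)

    n = len(slots)
    pyramid = []
    for size in range(1, n + 1):
        level = []
        for start in range(n - size + 1):
            feature = prefix_feature[start + size] - prefix_feature[start]
            total = prefix_total[start + size] - prefix_total[start]
            level.append(feature / total if total else 0.0)
        pyramid.append(level)
    return pyramid
```

Because each cell is a ratio rather than a raw count, a feature whose share of the traffic stays constant does not look bursty merely because the overall stream volume rises. The sketch stops at computing ratios: how bursts are then identified without parameters is specific to the paper and is not reproduced here.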