A priority random sampling algorithm for time-based sliding windows over weighted streaming data

  • Authors:
  • Zhang Longbo; Li Zhanhuai; Zhao Yiqiang; Yu Min; Zhang Yang

  • Affiliations:
  • Northwestern Polytechnical University, China and Shandong University of Technology, Zibo, Shandong, China; Northwestern Polytechnical University, China; Shandong University of Technology, Zibo, Shandong, China; Northwestern Polytechnical University, China; Northwest A&F University, Yangling, Shaanxi, China

  • Venue:
  • Proceedings of the 2007 ACM symposium on Applied computing
  • Year:
  • 2007

Abstract

This paper introduces the problem of random sampling from time-based sliding windows over weighted streaming data and presents a priority random sampling (PRS) algorithm for it. The algorithm extends the classic reservoir-sampling algorithm and the weighted random sampling (WRS) algorithm with a reservoir so that they can handle the expiration of data items from a time-based sliding window, and it avoids the drawbacks of both. In the new algorithm, each data item in the time-based sliding window is assigned a key that balances its weight against its arrival time. The algorithm works even when the number of data items in the sliding window varies dynamically over time. The experiments show that the PRS algorithm is somewhat superior to the WRS algorithm.
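
The general idea of keyed sampling over a time-based window can be sketched as follows. This is an illustration only, not the paper's PRS algorithm: the abstract does not give the PRS key formula, so the sketch assumes the key u^(1/w) known from weighted random sampling with a reservoir, and it handles arrival time only through expiration. The class name, its parameters, and the pruning rule (drop an item once k later items hold larger keys, since it can then never be among the k largest keys of any window that still contains it) are assumptions made for this example. Timestamps are assumed to be non-decreasing.

```python
import heapq
import random
from collections import deque


class SlidingWindowWeightedSampler:
    """Hypothetical sketch: keyed weighted sampling over a time-based window."""

    def __init__(self, k, window, rng=None):
        self.k = k                    # desired sample size
        self.window = window          # window length, in timestamp units
        self.rng = rng or random.Random()
        # Each entry is [timestamp, key, value, dominated_count],
        # stored in arrival order.
        self.items = deque()

    def _expire(self, now):
        # An item belongs to the window (now - window, now].
        while self.items and self.items[0][0] <= now - self.window:
            self.items.popleft()

    def add(self, value, weight, timestamp):
        if weight <= 0:
            raise ValueError("weight must be positive")
        self._expire(timestamp)
        u = 1.0 - self.rng.random()   # uniform in (0, 1]
        key = u ** (1.0 / weight)     # assumed key: larger weight, larger key
        kept = deque()
        for entry in self.items:
            if entry[1] < key:
                entry[3] += 1         # one more later item with a larger key
            if entry[3] < self.k:
                kept.append(entry)    # can still enter some future sample
        kept.append([timestamp, key, value, 0])
        self.items = kept

    def sample(self, now):
        # Return up to k values with the largest keys among unexpired items.
        self._expire(now)
        top = heapq.nlargest(self.k, self.items, key=lambda e: e[1])
        return [e[2] for e in top]
```

A caller would feed each arriving item to `add` and call `sample(now)` whenever a sample of the current window is needed. The sample may hold fewer than k items when the window itself contains fewer than k items, which is the varying-window-size situation the abstract mentions.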