Reservoir sampling is a well-known technique for sequential random sampling over data streams. Conventional reservoir sampling assumes a fixed-size reservoir. There are situations, however, in which it is necessary or advantageous to adjust the size of the reservoir while sampling is in progress, because of changes in data characteristics or application behavior. This paper studies adaptive-size reservoir sampling over data streams, considering two main factors: reservoir size and sample uniformity. First, the paper conducts a theoretical study of the effects of adjusting the size of a reservoir while sampling is in progress. The theoretical results show that such an adjustment may reduce the probability that the sample is uniform (called uniformity confidence herein). Second, the paper presents a novel algorithm for maintaining the reservoir sample after the reservoir size is adjusted, such that the resulting uniformity confidence exceeds a given threshold. Third, the paper extends the proposed algorithm to an adaptive multi-reservoir sampling algorithm for a practical application in which samples are collected from memory-limited wireless sensor networks using a mobile sink. Finally, the paper empirically examines the adaptivity of the multi-reservoir sampling algorithm with regard to reservoir size and sample uniformity, using real sensor network data sets.
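
For readers unfamiliar with the baseline, the following is a minimal Python sketch of conventional fixed-size reservoir sampling (the classic textbook scheme often called Algorithm R), plus a helper that shrinks a reservoir by keeping a random subset. It is not the adaptive algorithm proposed in the paper, whose details are not given in this abstract. The function names reservoir_sample and shrink_reservoir and the rng parameter are illustrative assumptions.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Conventional fixed-size reservoir sampling (textbook Algorithm R).

    Returns a uniform random sample of up to k items from an iterable
    whose length need not be known in advance.
    """
    rng = rng or random.Random()
    reservoir = []
    for n, item in enumerate(stream, start=1):
        if n <= k:
            # Fill phase: the first k items are always kept.
            reservoir.append(item)
        else:
            # The n-th item is kept with probability k / n and, if kept,
            # overwrites a uniformly chosen slot.
            j = rng.randrange(n)  # uniform integer in [0, n)
            if j < k:
                reservoir[j] = item
    return reservoir


def shrink_reservoir(reservoir, new_k, rng=None):
    """Illustrative helper (hypothetical, not from the paper).

    Shrinks a reservoir by keeping a uniformly chosen subset of new_k
    items. A uniform subset of a uniform sample is still a uniform
    sample of the items seen so far.
    """
    rng = rng or random.Random()
    if new_k >= len(reservoir):
        return list(reservoir)
    return rng.sample(reservoir, new_k)


if __name__ == "__main__":
    sample = reservoir_sample(range(10_000), k=20, rng=random.Random(42))
    smaller = shrink_reservoir(sample, new_k=5, rng=random.Random(42))
    print(len(sample), len(smaller))
```

Shrinking is the easy direction, since dropping items at random preserves uniformity. Growing the reservoir is harder because items that were already discarded cannot be recovered, which is the kind of adjustment the abstract says may lower the uniformity confidence.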