Recent work has demonstrated that readings provided by commodity sensor nodes are often of poor quality. To provide a valuable sensory infrastructure for monitoring applications, we first need to devise techniques that can withstand "dirty" and unreliable data during query processing. In this paper we present a novel aggregation framework that detects suspicious measurements by outlier nodes and refrains from incorporating such measurements in the computed aggregate values. We consider different definitions of an outlier node, based on the notion of a user-specified minimum support, and discuss techniques for properly routing messages in the network in order to reduce the bandwidth consumption and the energy drain during query evaluation. In our experiments using real and synthetic traces we demonstrate that: (i) a straightforward evaluation of a user aggregate query leads to practically meaningless results due to the existence of outliers; (ii) our techniques can detect and eliminate spurious readings without any application-specific knowledge of what constitutes normal behavior; (iii) the identification of outliers, when performed inside the network, significantly reduces bandwidth and energy drain compared to alternative methods that centrally collect and analyze all sensory data; and (iv) we can significantly reduce the cost of the aggregation process by utilizing simple statistics on outlier nodes and reorganizing the collection tree accordingly.
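As a rough illustration of the idea of a minimum-support outlier test followed by an outlier-free aggregate, the following minimal sketch may help. It is not the framework described in the paper: the similarity test (absolute difference within a hypothetical `tolerance`), the function names, the centralized evaluation, and the sample readings are all assumptions made for illustration only. Here a node is treated as an outlier when fewer than `min_support` other nodes report a similar reading.

```python
from statistics import mean


def find_outliers(readings, tolerance, min_support):
    """Return the ids of nodes supported by fewer than min_support other nodes.

    Assumption: two readings "support" each other when they differ by at
    most `tolerance`. The actual similarity notion may differ.
    """
    outliers = set()
    for node, value in readings.items():
        support = sum(
            1
            for other, other_value in readings.items()
            if other != node and abs(value - other_value) <= tolerance
        )
        if support < min_support:
            outliers.add(node)
    return outliers


def robust_average(readings, tolerance, min_support):
    """Average the readings after excluding nodes flagged as outliers."""
    outliers = find_outliers(readings, tolerance, min_support)
    kept = [v for node, v in readings.items() if node not in outliers]
    # With no trustworthy readings left, report no aggregate at all.
    return (mean(kept) if kept else None), outliers


# Hypothetical temperature readings; node "n4" reports a spurious value.
readings = {"n1": 20.0, "n2": 20.5, "n3": 19.5, "n4": 95.0}
naive_avg = mean(readings.values())
avg, outliers = robust_average(readings, tolerance=1.0, min_support=2)
```

In this toy input the naive average is pulled far from the readings of the three agreeing nodes, while the filtered average is not. The sketch evaluates everything in one place; performing the same test inside the network, as the abstract describes, would require routing and message-passing logic that is not shown here.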