GAMPS: compressing multi sensor data by grouping and amplitude scaling

Authors:
Sorabh Gandhi;Suman Nath;Subhash Suri;Jie Liu
Affiliations:
University of California, Santa Barbara, Santa Barbara, CA, USA;Microsoft Research, Redmond, WA, USA;University of California, Santa Barbara, Santa Barbara, CA, USA;Microsoft Research, Redmond, WA, USA
Venue:
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Year:
2009

Abstract

We consider the problem of collectively approximating a set of sensor signals using the least amount of space so that any individual signal can be efficiently reconstructed within a given maximum (L∞) error ε. The problem arises naturally in applications that need to collect large amounts of data from multiple concurrent sources, such as sensors, servers and network routers, and archive them over a long period of time for offline data mining. We present GAMPS, a general framework that addresses this problem by combining several novel techniques. First, it dynamically groups multiple signals together so that signals within each group are correlated and can be maximally compressed jointly. Second, it appropriately scales the amplitudes of different signals within a group and compresses them within the maximum allowed reconstruction error bound. Our schemes are polynomial-time O(α, β) approximation schemes, meaning that the maximum (L∞) error is at most αε and the space used is at most β times the optimal. Finally, GAMPS maintains an index so that various queries can be issued directly on compressed data. Our experiments on several real-world sensor datasets show that GAMPS significantly reduces space without compromising the quality of search and query.
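
The two ideas named in the abstract, compression under an L∞ error bound and amplitude scaling against a correlated signal, can be illustrated with a minimal Python sketch. This is not the GAMPS algorithm itself: the greedy piecewise-constant approximation is a standard stand-in technique, the function names (`compress_linf`, `reconstruct`, `max_error`) and the toy data are hypothetical, and the sketch assumes the base signal is available exactly, so it does not address how the error budget is split between a base signal and a ratio signal.

```python
from typing import List, Tuple

# One segment = (start index, constant value used for the whole segment).
Segment = Tuple[int, float]


def compress_linf(signal: List[float], eps: float) -> List[Segment]:
    """Greedy piecewise-constant approximation with max error <= eps.

    A segment is extended while (max - min) of its values stays within
    2 * eps; the midpoint of that range then lies within eps of each value.
    """
    segments: List[Segment] = []
    if not signal:
        return segments
    start = 0
    lo = hi = signal[0]
    for i, v in enumerate(signal[1:], start=1):
        new_lo, new_hi = min(lo, v), max(hi, v)
        if new_hi - new_lo > 2 * eps:
            # Close the current segment and start a new one at index i.
            segments.append((start, (lo + hi) / 2))
            start, lo, hi = i, v, v
        else:
            lo, hi = new_lo, new_hi
    segments.append((start, (lo + hi) / 2))
    return segments


def reconstruct(segments: List[Segment], n: int) -> List[float]:
    """Expand segments back into a signal of length n."""
    out: List[float] = []
    for k, (start, value) in enumerate(segments):
        end = segments[k + 1][0] if k + 1 < len(segments) else n
        out.extend([value] * (end - start))
    return out


def max_error(a: List[float], b: List[float]) -> float:
    """L-infinity distance between two equal-length signals."""
    return max(abs(x - y) for x, y in zip(a, b))


# Toy data (assumption): `other` is an exact amplitude-scaled copy of `base`.
base = [10.0, 12.0, 15.0, 19.0, 24.0, 30.0, 24.0, 19.0, 15.0, 12.0]
other = [2.0 * x for x in base]
ratio = [o / b for o, b in zip(other, base)]

eps = 1.0
base_hat = reconstruct(compress_linf(base, eps), len(base))
raw_segments = compress_linf(other, eps)      # compress `other` directly
ratio_segments = compress_linf(ratio, eps)    # compress the ratio instead

print(max_error(base, base_hat) <= eps)        # error bound holds
print(len(raw_segments), len(ratio_segments))  # 10 segments vs. 1
```

In this toy case the raw signal needs one segment per sample, while the ratio against its correlated base signal collapses to a single constant, which is the intuition behind compressing grouped signals jointly.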