Fusion coding of correlated sources for storage and selective retrieval

Authors:
Sharadh Ramaswamy;Jayant Nayak;Kenneth Rose
Affiliations:
Mayachitra Inc., Santa Barbara, CA;Mayachitra Inc., Santa Barbara, CA;Department of Electrical and Computer Engineering, University of California, Santa Barbara, CA
Venue:
IEEE Transactions on Signal Processing
Year:
2010

Abstract

We focus on a new, potentially important application of source coding directed toward storage and retrieval, termed fusion coding of correlated sources. The task is to store multiple correlated sources in a database efficiently so that, at any point in the future, data from a subset of sources specified by the user can be efficiently retrieved. Only statistical information about future queries is available in advance. A typical application scenario is the storage of correlated data generated by dense sensor networks, where information from specific regions is requested later. We propose a fusion coder (FC) for lossy storage and retrieval, in which different queries are handled by selectively retrieving a subset of the stored (compressed) bits. We derive the properties of an optimal FC and present an iterative algorithm for its design. Since iterative design depends on initialization, we present initialization heuristics that help avoid poor local optima. An analysis of design complexity shows that complexity grows with the size of the query set. We first tackle this problem by exploiting optimality properties of FCs. We also consider quantization of the query space with decision trees in order to adapt to new queries unseen during FC design. Experiments conducted on real and synthetic datasets demonstrate that the proposed FC achieves significantly better tradeoffs than joint compression by vector quantization (VQ), with retrieval speedups of up to a factor of 3 and distortion gains of up to 3.5 dB.
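
To make the idea of selective bit retrieval and iterative design concrete, the following is a minimal sketch and not the authors' algorithm. It assumes squared-error distortion, a fixed storage index of `n_bits` bits per sample, and hand-picked bit subsets per query (the abstract does not say how these are chosen). All names (`retrieved_key`, `design_fusion_coder`, `expected_bits`) are hypothetical. The loop alternates a decoder update (centroids per query and retrieved bit pattern) with an encoder update (the storage index that minimizes query-weighted distortion), in the style of Lloyd's algorithm.

```python
import random

def retrieved_key(index, bit_positions):
    """Pack the bits of `index` found at `bit_positions` into a small integer."""
    return sum(((index >> b) & 1) << j for j, b in enumerate(bit_positions))

def expected_bits(bit_sets, probs):
    """Average number of bits read per query under the query probabilities."""
    return sum(p * len(bs) for p, bs in zip(probs, bit_sets))

def design_fusion_coder(samples, queries, bit_sets, probs, n_bits,
                        n_iter=10, seed=0):
    """Alternating design sketch (assumption: bit_sets are fixed in advance).

    samples : list of source vectors (one value per source)
    queries : list of tuples of source indices requested together
    bit_sets: for each query, the stored bit positions that it retrieves
    probs   : query probabilities (the only query knowledge assumed here)
    """
    rng = random.Random(seed)
    n_idx = 2 ** n_bits
    assign = [rng.randrange(n_idx) for _ in samples]  # random initial encoder
    history = []
    for _ in range(n_iter):
        # Decoder update: per query, one codeword per retrieved bit pattern,
        # set to the centroid of the requested sources over matching samples.
        codebooks = []
        for q, bs in zip(queries, bit_sets):
            sums = {}
            for x, i in zip(samples, assign):
                k = retrieved_key(i, bs)
                total, count = sums.get(k, ([0.0] * len(q), 0))
                sums[k] = ([t + x[s] for t, s in zip(total, q)], count + 1)
            codebooks.append({k: [t / c for t in total]
                              for k, (total, c) in sums.items()})

        def cost(x, i):
            # Query-weighted squared error if sample x is stored as index i.
            c = 0.0
            for q, bs, p, cb in zip(queries, bit_sets, probs, codebooks):
                word = cb.get(retrieved_key(i, bs))
                if word is None:
                    return float("inf")  # unused pattern: nothing to decode
                c += p * sum((x[s] - w) ** 2 for s, w in zip(q, word))
            return c

        # Encoder update: best storage index for every training sample.
        assign = [min(range(n_idx), key=lambda i: cost(x, i)) for x in samples]
        history.append(sum(cost(x, i) for x, i in zip(samples, assign))
                       / len(samples))
    return assign, codebooks, history
```

In this sketch, giving every query all `n_bits` bits reduces to joint VQ retrieval, while smaller bit subsets trade distortion for fewer retrieved bits. Because the result depends on the random initial assignment, different seeds can converge to different local optima, which is the issue the initialization heuristics mentioned in the abstract are meant to address.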