MIST: distributed indexing and querying in sensor networks using statistical models

Authors:
Arnab Bhattacharya;Anand Meka;Ambuj K. Singh
Affiliations:
University of California, Santa Barbara, CA;University of California, Santa Barbara, CA;University of California, Santa Barbara, CA
Venue:
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Year:
2007

Abstract

The modeling of high-level semantic events from low-level sensor signals is important for understanding distributed phenomena. For such content-modeling purposes, numeric data can be transformed into symbols, and the resulting symbolic sequences can be modeled using statistical models, namely Markov Chains (MCs) and Hidden Markov Models (HMMs). We consider the problem of distributed indexing and semantic querying over such sensor models. Specifically, we are interested in efficiently answering (i) range queries: return all sensors that have observed an unusual sequence of symbols with a high likelihood, (ii) top-1 queries: return the sensor that has the maximum probability of observing a given sequence, and (iii) 1-NN queries: return the sensor (model) that is most similar to a query model. All of the above queries can be answered at a centralized base station if each sensor transmits its model to the base station. However, this is communication-intensive. We present a much more efficient alternative: a distributed index structure, MIST (Model-based Index STructure), and accompanying algorithms for answering the above queries. MIST aggregates two or more constituent models into a single composite model, and constructs an in-network hierarchy over such composite models. We develop two kinds of composite models: the first kind captures the average behavior of the underlying models, and the second kind captures the extreme behaviors of the underlying models. Using the index parameters maintained at the root of a subtree, we bound the probability of observation of a query sequence from a sensor in the subtree. We also bound the distance of a query model to a sensor model using these parameters. Extensive experimental evaluation on both real-world and synthetic data sets shows that the MIST schemes scale well in terms of network size and number of model states. We also show their superior performance over the centralized schemes in terms of update, query, and total communication costs.
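
To make the query types and the bounding idea concrete, the following is a minimal Python sketch under stated assumptions, not the paper's implementation. It assumes each sensor holds a first-order Markov chain given as an initial distribution and a transition matrix, and it reads the "extreme behavior" composite as the element-wise minimum and maximum of the constituent parameters; that reading, together with the names `sequence_probability`, `extreme_composites`, and `can_prune`, is hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]
Model = Tuple[Vector, Matrix]  # (initial distribution, transition matrix)


def sequence_probability(model: Model, seq: Sequence[int]) -> float:
    """Probability that a first-order Markov chain produces `seq`."""
    init, trans = model
    if not seq:
        return 1.0
    p = init[seq[0]]
    for prev, cur in zip(seq, seq[1:]):
        p *= trans[prev][cur]
    return p


def range_query(models: Dict[str, Model], seq: Sequence[int],
                threshold: float) -> List[str]:
    """Sensors whose model yields `seq` with probability >= threshold."""
    return sorted(s for s, m in models.items()
                  if sequence_probability(m, seq) >= threshold)


def top1_query(models: Dict[str, Model], seq: Sequence[int]) -> str:
    """Sensor with the maximum probability of observing `seq`."""
    return max(models, key=lambda s: sequence_probability(models[s], seq))


def extreme_composites(models: Sequence[Model]) -> Tuple[Model, Model]:
    """Element-wise min and max of the constituent parameters (assumed form).

    The results are generally not valid distributions; they serve only
    as lower and upper bounds on every constituent's sequence probability.
    """
    inits = [m[0] for m in models]
    mats = [m[1] for m in models]
    n = len(inits[0])
    lo_init = [min(v[i] for v in inits) for i in range(n)]
    hi_init = [max(v[i] for v in inits) for i in range(n)]
    lo_mat = [[min(t[i][j] for t in mats) for j in range(n)] for i in range(n)]
    hi_mat = [[max(t[i][j] for t in mats) for j in range(n)] for i in range(n)]
    return (lo_init, lo_mat), (hi_init, hi_mat)


def can_prune(hi_model: Model, seq: Sequence[int], threshold: float) -> bool:
    """A subtree can be skipped if even its upper bound misses the threshold."""
    return sequence_probability(hi_model, seq) < threshold


# Illustrative data only: two sensors over a two-symbol alphabet.
sensors: Dict[str, Model] = {
    "s1": ([0.5, 0.5], [[0.9, 0.1], [0.2, 0.8]]),
    "s2": ([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]]),
}
query = [0, 1, 1]
lo, hi = extreme_composites(list(sensors.values()))
print(top1_query(sensors, query), range_query(sensors, query, 0.1))
print(sequence_probability(lo, query), sequence_probability(hi, query))
```

Because all parameters are non-negative, multiplying the element-wise maxima along the query sequence can never give less than any constituent model's probability, so a subtree whose upper bound falls below the range threshold can be skipped without contacting its sensors. Extending the sketch to HMMs would replace the product with the forward algorithm.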