SNIF TOOL: sniffing for patterns in continuous streams

Authors:
Abhishek Mukherji; Elke A. Rundensteiner; David C. Brown; Venkatesh Raghavan
Affiliations:
Worcester Polytechnic Institute, Worcester, MA (all authors)
Venue:
Proceedings of the 17th ACM conference on Information and knowledge management
Year:
2008

Abstract

Continuous time-series sequence matching, specifically matching a numeric live stream against a set of predefined pattern sequences, is critical for domains ranging from fire spread tracking to network traffic monitoring. While several algorithms exist for similarity matching of static time-series data, matching continuous data poses new, largely unsolved challenges, including online real-time processing requirements and limited system resources for handling infinite streams. In this work, we propose a novel live stream matching framework, called the n-Snippet Indices Framework (SNIF for short), to tackle these challenges. SNIF employs snippets as the basic unit for matching streaming time series. The insight is to perform the matching at two levels of granularity: bag matching of subsets of snippets of the live stream against prefixes of the patterns, and order checking for maintaining successive candidate snippet bag matches. We design a two-level index structure, called the SNIF index, which supports these two modes of matching. We propose a family of online two-level prefix matching algorithms that trade off result accuracy against response time. The effectiveness of SNIF in detecting patterns has been thoroughly tested through experiments using real datasets from the domains of fire monitoring and sensor motes. In this paper, we also present a study of SNIF's performance, accuracy, and tolerance to noise compared against those of the state-of-the-art Continuous Query with Prediction (CQP) approach.
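
The two-level idea in the abstract (an order-insensitive bag match on snippets, followed by an order check) can be illustrated with a toy sketch. This is not the SNIF index or any of the paper's algorithms. It assumes that a snippet is a fixed-length window of n consecutive values rounded to a grid, that the bag test counts distinct prefix snippets seen in the stream window, and that the order test requires those snippets to share one alignment. All names and parameters (snippets, build_index, match_window, n, prefix_len, step, min_hits) are hypothetical.

```python
from collections import Counter, defaultdict

# Toy illustration only. Every name and parameter below is hypothetical and
# is not taken from the paper.

def snippets(seq, n, step=1.0):
    """Overlapping length-n snippets of seq, quantized to a grid of width step."""
    return [tuple(round(v / step) for v in seq[i:i + n])
            for i in range(len(seq) - n + 1)]

def build_index(patterns, n, prefix_len, step=1.0):
    """Inverted index: snippet -> [(pattern id, offset within the pattern prefix)]."""
    index = defaultdict(list)
    for pid, pattern in patterns.items():
        for off, snip in enumerate(snippets(pattern[:prefix_len], n, step)):
            index[snip].append((pid, off))
    return index

def match_window(window, index, n, min_hits, step=1.0):
    """Return sorted ids of patterns whose prefix passes both tests on this window."""
    hits = defaultdict(list)  # pattern id -> [(stream position, prefix offset)]
    for pos, snip in enumerate(snippets(window, n, step)):
        for pid, off in index.get(snip, ()):
            hits[pid].append((pos, off))
    matched = []
    for pid, pairs in hits.items():
        # Level 1 (assumed bag test): enough distinct prefix snippets were
        # seen somewhere in the window, regardless of their order.
        if len({off for _, off in pairs}) < min_hits:
            continue
        # Level 2 (assumed order test): enough of those snippets share one
        # alignment, i.e. the same (stream position - prefix offset).
        alignments = Counter(pos - off for pos, off in pairs)
        if max(alignments.values()) >= min_hits:
            matched.append(pid)
    return sorted(matched)

patterns = {"rise": [1, 2, 3, 4, 5], "fall": [5, 4, 3, 2, 1]}
index = build_index(patterns, n=2, prefix_len=4)
in_order = match_window([0, 1, 2, 3, 4], index, n=2, min_hits=3)
shuffled = match_window([3, 4, 2, 3, 1, 2], index, n=2, min_hits=3)
```

In this sketch the shuffled window contains every snippet of the "rise" prefix, so it passes the bag test, but the snippets do not line up on a single alignment, so the order test rejects it. Exact snippet equality after rounding stands in for whatever similarity measure a real system would use.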