Matching and retrieving sequential patterns using regression

Authors:
Hansheng Lei;Venu Govindaraju
Affiliations:
CUBS, Center for Unified Biometrics and Sensors, State University of New York at Buffalo, Amherst, NY;CUBS, Center for Unified Biometrics and Sensors, State University of New York at Buffalo, Amherst, NY
Venue:
Web Intelligence and Agent Systems
Year:
2005

Abstract

Sequential pattern mining is useful for predicting future activities, interpreting recurring phenomena, and extracting similarities in a series of events. For example, in the NASDAQ market, the problem of finding stocks whose closing prices are always about $β0 higher than, or β1 times, those of a given company reduces to linear pattern retrieval: given a query X, find all sequences Y in the database S such that Y = β0 + β1 X with confidence C. In this paper, we introduce a novel approach that uses the Simple Linear Regression (SLR) model to match and retrieve sequential patterns. We extend the one-dimensional R2 model to ER2 for multi-dimensional sequence matching. In addition, we present the SLR + FFT pruning technique, which speeds up data retrieval without incurring any false dismissals. Experimental results on both synthetic and real datasets show that the pruning ratio of SLR + FFT can exceed 99%. Applying the retrieval technique to real stocks led to the discovery of many interesting patterns, some of which are presented in the paper. Using ER2 as the similarity measure for on-line signature recognition also yielded high accuracy.
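
The linear pattern retrieval problem stated in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the confidence C is read as a threshold on the coefficient of determination of a least-squares fit, it scans the database exhaustively (the SLR + FFT pruning step is not reproduced), and the names `fit_slr` and `retrieve_linear_patterns` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def fit_slr(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1, r2)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("constant sequence: regression is undefined")
    b1 = sxy / sxx                      # slope (beta_1)
    b0 = mean_y - b1 * mean_x           # intercept (beta_0)
    r2 = (sxy * sxy) / (sxx * syy)      # coefficient of determination
    return b0, b1, r2


def retrieve_linear_patterns(
    query: Sequence[float],
    database: Dict[str, Sequence[float]],
    confidence: float,
) -> List[Tuple[str, float, float, float]]:
    """Return (name, b0, b1, r2) for every sequence Y with r2 >= confidence.

    Assumption: "confidence C" is treated here as an R^2 threshold.
    Sequences of a different length or with no variance are skipped.
    """
    matches = []
    for name, seq in database.items():
        if len(seq) != len(query):
            continue
        try:
            b0, b1, r2 = fit_slr(query, seq)
        except ValueError:
            continue
        if r2 >= confidence:
            matches.append((name, b0, b1, r2))
    # Best-fitting sequences first.
    return sorted(matches, key=lambda m: m[3], reverse=True)


if __name__ == "__main__":
    query = [1.0, 2.0, 3.0, 4.0, 5.0]
    database = {
        "scaled_shifted": [12.0, 14.0, 16.0, 18.0, 20.0],  # 10 + 2 * query
        "unrelated": [3.0, 1.0, 4.0, 1.0, 5.0],
    }
    for name, b0, b1, r2 in retrieve_linear_patterns(query, database, 0.9):
        print(f"{name}: Y = {b0:.2f} + {b1:.2f} X (R^2 = {r2:.3f})")
```

In this sketch a sequence that is an exact scaled and shifted copy of the query fits perfectly and is returned together with its estimated β0 and β1, while a sequence with only a weak linear relationship to the query falls below the threshold and is dropped.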