The volume of time series data is growing rapidly in applications such as network traffic management, telecommunications, finance, and sensor networks. To reduce the cost of storing, transmitting, and processing this data, more compact representations of time series are needed. Segmentation is one of the most commonly used ways to meet this requirement. Piecewise linear approximation (PLA) and piecewise polynomial approximation (PPA) are common segmentation methods that divide a time series into segments and approximate each segment with a linear function or a polynomial function, respectively. However, while most current PLA and PPA methods aim to minimize the overall error between the approximation and the original time series, few try to represent the time series as compactly as possible while guaranteeing an error bound on each data point. Furthermore, in many real-world situations the patterns in a time series do not follow a single fixed rule, so using only one type of function may not yield the most compact representation. Motivated by these observations, we propose an online segmentation algorithm that approximates a time series with a set of candidate functions of different types (polynomials of different orders, exponential functions, etc.) and adaptively chooses the most compact one as the pattern of the time series changes. A challenge in this approach is determining the approximation function on the fly ("online"). We therefore further propose a novel method that efficiently generates a compact approximation of a time series in an online fashion for several types of candidate functions. This method incrementally narrows the feasible coefficient space of each candidate function in its coefficient coordinate system, so that each segment is made as long as possible subject to an error bound on each data point. Extensive experimental results show that our algorithm generates more compact approximations of time series, with lower average errors, than the state-of-the-art algorithm.
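The idea of narrowing a feasible coefficient space under a per-point error bound can be illustrated with a deliberately simplified sketch. It is not the algorithm proposed in the paper: it assumes a single candidate function type (a line anchored at the first point of each segment), so the only free coefficient is the slope and the feasible space is a one-dimensional interval. The names `segment_with_error_bound`, `reconstruct`, and `eps` are hypothetical and chosen only for this illustration.

```python
from typing import List, Tuple

# (start_index, end_index, anchor_value, slope); hypothetical layout for this sketch
Segment = Tuple[int, int, float, float]


def segment_with_error_bound(values: List[float], eps: float) -> List[Segment]:
    """Greedy online segmentation with a maximum error of eps on every point.

    Assumption: each segment is a line forced through its first point, so the
    feasible coefficient space is just an interval [lo, hi] of slopes.
    """
    segments: List[Segment] = []
    n = len(values)
    start = 0
    while start < n:
        anchor = values[start]
        lo, hi = float("-inf"), float("inf")
        end = start
        for i in range(start + 1, n):
            dt = i - start
            # Slopes that keep point i within eps of the line.
            new_lo = max(lo, (values[i] - eps - anchor) / dt)
            new_hi = min(hi, (values[i] + eps - anchor) / dt)
            if new_lo > new_hi:
                break  # feasible space is empty: close the segment before i
            lo, hi = new_lo, new_hi
            end = i
        # Any slope in [lo, hi] satisfies the bound; take the midpoint.
        slope = 0.0 if end == start else (lo + hi) / 2.0
        segments.append((start, end, anchor, slope))
        start = end + 1
    return segments


def reconstruct(segments: List[Segment]) -> List[float]:
    """Rebuild the approximated series from the stored segments."""
    out: List[float] = []
    for start, end, anchor, slope in segments:
        out.extend(anchor + slope * (i - start) for i in range(start, end + 1))
    return out


if __name__ == "__main__":
    data = [0.0, 1.0, 2.0, 3.0, 10.0, 9.0, 8.0, 7.0]
    segs = segment_with_error_bound(data, eps=0.5)
    approx = reconstruct(segs)
    print(segs)
    print(max(abs(a - b) for a, b in zip(approx, data)))
```

Each new point intersects the current slope interval with the interval that point allows, and a segment is closed only when the intersection becomes empty, which makes every segment as long as this restricted function family permits. Handling several candidate function types, as the abstract describes, would mean keeping one such feasible region per candidate and choosing the most compact result; that part is not shown here.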