BIRCH: an efficient data clustering method for very large databases
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Computing depth contours of bivariate point clouds
Computational Statistics & Data Analysis - Special issue on classification
OPTICS: ordering points to identify the clustering structure
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
LOF: identifying density-based local outliers
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Incremental Clustering for Mining in a Data Warehousing Environment
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Algorithms for Mining Distance-Based Outliers in Large Datasets
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Maintaining variance and k-medians over data stream windows
Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Clustering Data Streams: Theory and Practice
IEEE Transactions on Knowledge and Data Engineering
FOCS '00 Proceedings of the 41st Annual Symposium on Foundations of Computer Science
The CQL continuous query language: semantic foundations and query execution
The VLDB Journal — The International Journal on Very Large Data Bases
Online clustering of parallel data streams
Data & Knowledge Engineering
Adaptive Clustering for Multiple Evolving Streams
IEEE Transactions on Knowledge and Data Engineering
Online outlier detection in sensor data using non-parametric models
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Density-based clustering for real-time stream data
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
A framework for clustering evolving data streams
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Resource sharing in continuous sliding-window aggregates
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Detecting distance-based outliers in streams of data
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
A shared execution strategy for multiple pattern mining requests over streaming data
Proceedings of the VLDB Endowment
Interactive visual exploration of neighbor-based patterns in data streams
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
An optimal strategy for monitoring top-k queries in streaming windows
Proceedings of the 14th International Conference on Extending Database Technology
CLUES: a unified framework supporting interactive exploration of density-based clusters in streams
Proceedings of the 20th ACM international conference on Information and knowledge management
Summarization and matching of density-based clusters in streaming environments
Proceedings of the VLDB Endowment
Shared execution strategy for neighbor-based pattern mining requests over streaming windows
ACM Transactions on Database Systems (TODS)
Dense subgraph maintenance under streaming edge weight updates for real-time story identification
Proceedings of the VLDB Endowment
AnyOut: anytime outlier detection on streaming data
DASFAA'12 Proceedings of the 17th international conference on Database Systems for Advanced Applications - Volume Part I
Probabilistic distance based abnormal pattern detection in uncertain series data
Knowledge-Based Systems
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
Mining and linking patterns across live data streams and stream archives
Proceedings of the VLDB Endowment
A framework of traveling companion discovery on trajectory data streams
ACM Transactions on Intelligent Systems and Technology (TIST) - Special Section on Intelligent Mobile Knowledge Discovery and Management Systems and Special Issue on Social Web Mining
Hi-index | 0.00 |
The discovery of complex patterns such as clusters, outliers, and associations from huge volumes of streaming data has been recognized as critical for many domains. However, pattern detection with sliding window semantics, as required by applications ranging from stock market analysis to moving object tracking remains largely unexplored. Applying static pattern detection algorithms from scratch to every window is prohibitively expensive due to their high algorithmic complexity. This work tackles this problem by developing the first solution for incremental detection of neighbor-based patterns specific to sliding window scenarios. The specific pattern types covered in this work include density-based clusters and distance-based outliers. Incremental pattern computation in highly dynamic streaming environments is challenging, because purging a large amount of to-be-expired data from previously formed patterns may cause complex pattern changes including migration, splitting, merging and termination of these patterns. Previous incremental neighbor-based pattern detection algorithms, which were typically not designed to handle sliding windows, such as incremental DBSCAN, are not able to solve this problem efficiently in terms of both CPU and memory consumption. To overcome this, we exploit the "predictability" property of sliding windows to elegantly discount the effect of expiring objects on the remaining pattern structures. Our solution achieves minimal CPU utilization, while still keeping the memory utilization linear in the number of objects in the window. Our comprehensive experimental study, using both synthetic as well as real data from domains of stock trades and moving object monitoring, demonstrates superiority of our proposed strategies over alternate methods in both CPU and memory utilization.