BIRCH: an efficient data clustering method for very large databases
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
Automatic subspace clustering of high dimensional data for data mining applications
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
LOF: identifying density-based local outliers
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Efficient algorithms for mining outliers from large data sets
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Anomaly Detection over Noisy Data using Learned Probability Distributions
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Algorithms for Mining Distance-Based Outliers in Large Datasets
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
WaveCluster: A Multi-Resolution Clustering Approach for Very Large Spatial Databases
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Finding Intensional Knowledge of Distance-Based Outliers
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Efficient and Effective Clustering Methods for Spatial Data Mining
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Mining distance-based outliers in near linear time with randomization and a simple pruning rule
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Finding (Recently) Frequent Items in Distributed Data Streams
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Feature bagging for outlier detection
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Mining distance-based outliers from large databases in any metric space
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Outlier detection by sampling with accuracy guarantees
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Online outlier detection in sensor data using non-parametric models
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Ranking outliers using symmetric neighborhood relationship
PAKDD'06 Proceedings of the 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
Detecting Current Outliers: Continuous Outlier Detection over Time-Series Data Streams
DEXA '08 Proceedings of the 19th international conference on Database and Expert Systems Applications
DISTRO: a system for detecting global outliers from distributed data streams with privacy protection
DASFAA'10 Proceedings of the 15th international conference on Database Systems for Advanced Applications - Volume Part II
Hi-index | 0.00 |
In many applications, stream data are too voluminous to be collected in a central fashion and often transmitted on a distributed network. In this paper, we focus on the outlier detection over distributed data streams in real time, firstly, we formalize the problem of outlier detection using the kernel density estimation technique. Then, we adopt the fading strategy to keep pace with the transient and evolving natures of stream data, and mico-cluster technique to conquer the data partition and "one-pass" scan. Furthermore, our extensive experiments with synthetic and real data show that the proposed algorithm is efficient and effective compared with existing outlier detection algorithms, and more suitable for data streams.