Stream data are often transmitted over a distributed network but are, in many cases, too voluminous to be collected at a central location. Instead, computation must be distributed while still guaranteeing high-quality results in real time as new data arrive. In this paper, we first formalize the problem of continuous outlier detection over distributed evolving data streams. We then propose two novel outlier measures and algorithms that identify outliers in a single pass. Experiments on synthetic and real data show that the proposed methods are both efficient and effective compared with existing outlier detection algorithms.