Outlier detection is a useful technique in many applications where data is inherently uncertain and can be described probabilistically. Although it has been studied intensively for deterministic data, outlier detection is still new in the emerging field of uncertain data. In this paper, we study the semantics of outlier detection on probabilistic data streams and present a new definition of a distance-based outlier over a sliding window. We then show that the problem of detecting an outlier over a set of possible world instances is equivalent to the problem of finding the k-th element in its neighborhood. Based on this observation, we propose a dynamic programming algorithm (DPA) that reduces the detection cost from O(2^|R(e,d)|) to O(k·|R(e,d)|), where R(e,d) is the d-neighborhood of e. Furthermore, we propose a pruning-based approach (PBA) that effectively and efficiently filters non-outliers within a single window and dynamically, incrementally checks the most recent m elements. Finally, detailed analysis and thorough experimental results demonstrate the efficiency and scalability of our approach.
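The step from exponential to O(k·|R(e,d)|) cost can be illustrated with a minimal sketch. It is not the paper's DPA, whose details the abstract does not give. It assumes (our assumptions, not stated in the source) that each neighbor in R(e,d) exists independently with a known probability, and that e counts as an outlier in a possible world when fewer than k of its neighbors exist. The function names `prob_fewer_than_k` and `prob_fewer_than_k_bruteforce` are hypothetical.

```python
from itertools import product


def prob_fewer_than_k(neighbor_probs, k):
    """Probability that fewer than k neighbors exist, in O(k * n) time.

    Assumption (not from the source): neighbors exist independently,
    each with the probability given in neighbor_probs.
    """
    if k <= 0:
        return 0.0
    # dp[j] = probability that exactly j of the neighbors seen so far exist.
    # Only counts 0..k-1 are tracked; mass reaching k or more is dropped.
    dp = [0.0] * k
    dp[0] = 1.0
    for p in neighbor_probs:
        # Go downward so dp[j - 1] still holds the previous step's value.
        for j in range(k - 1, 0, -1):
            dp[j] = dp[j] * (1.0 - p) + dp[j - 1] * p
        dp[0] *= 1.0 - p
    return sum(dp)


def prob_fewer_than_k_bruteforce(neighbor_probs, k):
    """Same quantity by enumerating all 2^n possible worlds (for checking)."""
    total = 0.0
    for world in product([0, 1], repeat=len(neighbor_probs)):
        if sum(world) < k:
            weight = 1.0
            for present, p in zip(world, neighbor_probs):
                weight *= p if present else 1.0 - p
            total += weight
    return total


if __name__ == "__main__":
    probs = [0.9, 0.2, 0.6, 0.5]  # illustrative existence probabilities
    print(prob_fewer_than_k(probs, 2))
    print(prob_fewer_than_k_bruteforce(probs, 2))
```

Under these assumptions, comparing the result against a user-chosen probability threshold would decide whether e is reported as an outlier. The brute-force version is included only to check the dynamic program on small neighborhoods.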