Detecting outliers on arbitrary data streams using anytime approaches
Proceedings of the First International Workshop on Novel Data Stream Pattern Mining Techniques
AnyOut: anytime outlier detection on streaming data
DASFAA'12 Proceedings of the 17th international conference on Database Systems for Advanced Applications - Volume Part I
Hi-index | 0.00 |
This paper introduces a new extension of outlier detection approaches and a new concept, class separation through variance. We show that accumulating information about the outlierness of points in multiple subspaces leads to a ranking in which classes with differing variance naturally tend to separate. Exploiting this leads to a highly effective and efficient unsupervised class separation approach, especially useful in the difficult case of heavily overlapping distributions. Unlike typical outlier detection algorithms, this method can be applied beyond the `rare classes' case with great success. Two novel algorithms that implement this approach are provided. Additionally, experiments show that the novel methods typically outperform other state-of-the-art outlier detection methods on high dimensional data such as Feature Bagging, SOE1, LOF, ORCA and Robust Mahalanobis Distance and competes even with the leading supervised classification methods.