Identifying contextual outliers allows the discovery of anomalous behavior that other forms of outlier detection cannot find. Behavior that appears normal with respect to the entire data set can be shown to be anomalous by subsetting the data according to a specific spatial or temporal context. However, in many real-world applications, we may not have sufficient a priori contextual information to discover these contextual outliers. This paper addresses the problem by proposing a probabilistic approach based on random walks, which can simultaneously explore meaningful contexts and score the contextual outliers therein. Our approach has several advantages: it produces outlier scores that can be interpreted as stationary expectations, and these scores can be calculated in closed form in polynomial time. In addition, we show that point outlier detection using the stationary distribution is a special case of our approach. This allows us to find both global and contextual outliers simultaneously and to create a meaningful ranked list consisting of both types of outliers. This is a major departure from existing work, where an algorithm typically identifies only one type of outlier. The effectiveness of our method is supported by empirical results on real data sets, with comparison to related work.
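To make the special case mentioned above concrete, the following is a minimal sketch of point outlier detection with the stationary distribution of a random walk. It is not the paper's contextual method. The Gaussian similarity kernel, the `bandwidth` parameter, the function name `stationary_distribution`, and the toy data are all assumptions introduced for illustration. The idea is that a walker on a similarity graph rarely visits poorly connected points, so a low stationary probability suggests a global outlier.

```python
import math


def stationary_distribution(points, bandwidth=1.0, max_iters=1000, tol=1e-12):
    """Return the stationary distribution of a random walk on a similarity graph.

    Assumptions (not from the abstract): Gaussian similarity with a
    hypothetical `bandwidth` parameter, a fully connected graph, no self-loops.
    """
    n = len(points)

    # Pairwise Gaussian similarities; the diagonal stays zero (no self-loops).
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                dist2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                weights[i][j] = math.exp(-dist2 / (2.0 * bandwidth ** 2))

    # Row-normalize into transition probabilities. A row that underflows to
    # zero falls back to a uniform jump so the matrix stays row-stochastic.
    transition = []
    for row in weights:
        total = sum(row)
        if total > 0.0:
            transition.append([w / total for w in row])
        else:
            transition.append([1.0 / n] * n)

    # Power iteration: repeat pi <- pi * P until the change is negligible.
    pi = [1.0 / n] * n
    for _ in range(max_iters):
        nxt = [sum(pi[i] * transition[i][j] for i in range(n)) for j in range(n)]
        delta = max(abs(a - b) for a, b in zip(nxt, pi))
        pi = nxt
        if delta < tol:
            break
    return pi


# Four clustered points plus one isolated point (illustrative data only).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (3.0, 3.0)]
pi = stationary_distribution(points)

# Rank from most to least outlying: lowest stationary probability first.
ranking = sorted(range(len(points)), key=lambda i: pi[i])
print(ranking)
```

In this sketch the isolated point receives the smallest stationary probability and so heads the ranking. Scoring contextual outliers, as the abstract describes, would additionally require discovering the contexts themselves, which this example does not attempt.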