Spatial Categorical Outlier Detection (SCOD) has attracted considerable attention in spatial data mining and geological analysis. When faced with an SCOD problem, some researchers apply Spatial Numerical Outlier Detection measures by mapping categorical attributes to continuous ones. However, such approaches fail to capture the special properties of spatial categorical data and are therefore prone to masking and swamping. In this paper, we model the spatial dependencies between spatial categorical observations and propose a Pair Correlation Function (PCF) based method to detect spatial categorical outliers (SCOs). First, a new metric, the Pair Correlation Ratio (PCR), is estimated for each pair of categorical combinations based on their co-occurrence frequency at different spatial distances. The discrete PCRs are then fitted to a continuous function of distance. The outlier score of an object is computed as the average PCR between that object and its spatial neighbors. Observations with the lowest scores are labeled as potential SCOs. Extensive experiments demonstrate that the PCF-based method outperforms existing approaches.
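The pipeline described above (estimate PCRs per distance, fit them to a continuous function, average over neighbors, flag the lowest scores) can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so several parts are assumptions rather than the authors' method: the PCR is taken here as the observed share of co-occurring category pairs in a distance bin divided by the share expected if categories were independent of location; piecewise-linear interpolation stands in for the continuous fit; and neighbors are the k nearest points. The function names, `bin_edges`, and `n_neighbors` are hypothetical.

```python
import numpy as np


def pairwise_distances(coords):
    """Euclidean distance matrix for an (n, d) coordinate array."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def pair_correlation_ratios(dist, codes, n_cats, bin_edges):
    """Return an array pcr[a, b, bin] of assumed pair correlation ratios.

    Assumption: PCR = observed co-occurrence share / share expected under
    independence (p_a * p_b). Empty distance bins get the neutral value 1.
    """
    n_bins = len(bin_edges) - 1
    i, j = np.triu_indices(len(codes), k=1)        # each unordered pair once
    bins = np.digitize(dist[i, j], bin_edges) - 1  # distance-bin index per pair
    keep = (bins >= 0) & (bins < n_bins)           # drop pairs outside the bins
    i, j, bins = i[keep], j[keep], bins[keep]

    counts = np.zeros((n_cats, n_cats, n_bins))
    np.add.at(counts, (codes[i], codes[j], bins), 1)  # count (a, b)
    np.add.at(counts, (codes[j], codes[i], bins), 1)  # and (b, a): symmetric

    totals = counts.sum(axis=(0, 1))
    observed = np.divide(counts, totals, out=np.zeros_like(counts),
                         where=totals > 0)
    p = np.bincount(codes, minlength=n_cats) / len(codes)
    expected = np.outer(p, p)[:, :, None]
    return np.where(totals > 0, observed / expected, 1.0)


def pcr_outlier_scores(coords, labels, bin_edges, n_neighbors=4):
    """Average interpolated PCR between each object and its nearest neighbors."""
    coords = np.asarray(coords, dtype=float)
    bin_edges = np.asarray(bin_edges, dtype=float)
    cats, codes = np.unique(np.asarray(labels), return_inverse=True)
    dist = pairwise_distances(coords)
    pcr = pair_correlation_ratios(dist, codes, len(cats), bin_edges)
    centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])

    np.fill_diagonal(dist, np.inf)                  # an object is not its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :n_neighbors]
    scores = np.empty(len(codes))
    for i, nbrs in enumerate(neighbors):
        # Linear interpolation between bin centers stands in for the fitted function.
        vals = [np.interp(dist[i, j], centers, pcr[codes[i], codes[j]])
                for j in nbrs]
        scores[i] = np.mean(vals)
    return scores


# Toy data: a 6x6 grid, left half "A", right half "B", one "B" planted among the "A"s.
coords = [(x, y) for x in range(6) for y in range(6)]
labels = ["A" if x < 3 else "B" for x, _ in coords]
labels[1 * 6 + 3] = "B"                             # the point at x=1, y=3
scores = pcr_outlier_scores(coords, labels, bin_edges=[0.5, 1.5, 2.5, 3.5])
print(int(np.argmin(scores)))                       # → 9, the planted "B"
```

On this toy grid the planted point is surrounded only by "A" neighbors, a pairing that is rarer at short distances than independence would predict, so its average PCR falls well below that of every other point. A real application would need a principled choice of distance bins, neighborhood size, and fitting function.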