Outlier detection is concerned with discovering exceptional behaviors of objects. Its theoretical principles and practical implementations lay a foundation for important applications such as credit card fraud detection, detection of criminal behavior in e-commerce, and computer intrusion detection. In this paper, we first present a unified model for several existing outlier detection schemes and propose a compatibility theory, which establishes a framework for describing the capabilities of various outlier formulation schemes in terms of how well they match users' intuitions. Under this framework, we show that the density-based scheme is more powerful than the distance-based scheme when a dataset contains patterns with diverse characteristics. The density-based scheme, however, is less effective when the patterns have densities comparable to those of the outliers. We then introduce a connectivity-based scheme that improves on the effectiveness of the density-based scheme when a pattern itself has a density similar to that of an outlier. We compare the density-based and connectivity-based schemes in terms of their strengths and weaknesses, and demonstrate applications with different features in which each is more effective than the other. Finally, the connectivity-based and density-based schemes are comparatively evaluated on both real-life and synthetic datasets in terms of recall, precision, rank power, and implementation-free metrics.
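The abstract names recall, precision, and rank power as evaluation metrics but does not define them. As a rough sketch only, the Python below assumes that a scheme returns objects ranked from most to least outlying, that the top m are inspected, and that rank power takes the form n(n+1) / (2 * sum of the ranks of the n true outliers found). That formula is an assumption here, not something stated in the abstract, and the function name `evaluate_top_m` and the sample data are hypothetical.

```python
from typing import Hashable, Sequence, Set, Tuple


def evaluate_top_m(
    ranked: Sequence[Hashable],
    true_outliers: Set[Hashable],
    m: int,
) -> Tuple[float, float, float]:
    """Return (precision, recall, rank_power) for the top-m objects of a ranking.

    `ranked` lists objects from most to least outlying (hypothetical input format).
    """
    top = list(ranked[:m])
    # 1-based positions of the true outliers that appear among the top m.
    hit_ranks = [pos for pos, obj in enumerate(top, start=1) if obj in true_outliers]
    n = len(hit_ranks)

    precision = n / len(top) if top else 0.0
    recall = n / len(true_outliers) if true_outliers else 0.0
    # Assumed definition: equals 1.0 only when all n hits occupy ranks 1..n.
    rank_power = n * (n + 1) / (2 * sum(hit_ranks)) if n else 0.0
    return precision, recall, rank_power


if __name__ == "__main__":
    # Made-up ranking and ground truth, for illustration only.
    ranking = ["a", "b", "c", "d", "e"]
    truth = {"a", "c", "z"}
    print(evaluate_top_m(ranking, truth, m=4))
```

With this made-up input, two of the top four objects are true outliers at ranks 1 and 3, so precision is 0.5, recall is 2/3, and the assumed rank power is 6/8 = 0.75. Unlike precision and recall, rank power under this definition rewards placing the true outliers nearer the top of the list, which is one way to separate two schemes that retrieve the same set of outliers.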