Parameter-free anomaly detection for categorical data
MLDM'11 Proceedings of the 7th international conference on Machine learning and data mining in pattern recognition
A ranking-based algorithm for detection of outliers in categorical data
International Journal of Hybrid Intelligent Systems
Distance-based outlier detection is an important data mining technique that finds abnormal data objects according to some distance function. However, when this technique is applied to high-dimensional categorical data, the traditional simple matching dissimilarity measure does not provide an adequate model. In this article, we employ a new common-neighbor-based distance function to measure the proximity between a pair of data points. Experiments show that better outlier mining results are achieved with the new distance function than with the conventional simple matching dissimilarity measure.
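To make the contrast between the two measures concrete, the following is a minimal Python sketch. The simple matching dissimilarity is the standard fraction of mismatching attributes. The abstract does not give the common-neighbor-based function, so the version here is an assumption made for illustration only: two records count as neighbors when their simple matching dissimilarity is at most a hypothetical threshold `theta`, and the distance between two records is one minus the Jaccard overlap of their neighbor sets. The function names and the sample records are likewise hypothetical.

```python
from typing import List, Sequence, Set


def simple_matching(a: Sequence[str], b: Sequence[str]) -> float:
    """Fraction of attributes on which two categorical records disagree."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def neighbor_sets(data: Sequence[Sequence[str]], theta: float) -> List[Set[int]]:
    """For each record, the indices of records within simple-matching distance theta.

    ASSUMPTION: theta >= 0, so every record is a neighbor of itself.
    """
    return [
        {j for j, other in enumerate(data) if simple_matching(rec, other) <= theta}
        for rec in data
    ]


def common_neighbor_distance(i: int, j: int, neighbors: List[Set[int]]) -> float:
    """ASSUMED form (not taken from the article): 1 - Jaccard overlap of neighbor sets."""
    shared = neighbors[i] & neighbors[j]
    union = neighbors[i] | neighbors[j]  # never empty: each record neighbors itself
    return 1.0 - len(shared) / len(union)


if __name__ == "__main__":
    # Hypothetical toy data: three related records and one isolated record.
    data = [
        ("a", "x", "p"),
        ("a", "x", "q"),
        ("a", "y", "q"),
        ("b", "z", "r"),
    ]
    neighbors = neighbor_sets(data, theta=0.4)
    for i in range(len(data)):
        for j in range(i + 1, len(data)):
            print(i, j, simple_matching(data[i], data[j]),
                  common_neighbor_distance(i, j, neighbors))
```

In this toy data the last record shares no neighbors with any other record, so its assumed common-neighbor distance to each of them is the maximum value of 1. A distance-based outlier detector could then use this function in place of simple matching when counting how many records lie close to a given point.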