Outlier detection is an important data mining problem that has attracted considerable research interest in the recent past. As a result, various methods for outlier detection have been developed, mostly for numerical data, while categorical data has received comparatively little attention. To address this gap, we propose a two-phase algorithm for detecting outliers in categorical data, based on a novel definition of outliers. The first phase explores a clustering of the given data; the second phase ranks the data to determine the set of most likely outliers. Because it employs two independent ranking schemes, one based on attribute value frequencies and the other on the inherent clustering structure of the data, the algorithm can identify different types of outliers and is therefore expected to perform better. Unlike some existing methods, its computational complexity is not affected by the number of outliers to be detected. The efficacy of the algorithm is demonstrated through experiments on various public-domain categorical data sets.
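The two ranking ideas mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not give their outlier definition or scoring formulas, so the scores below are simple stand-ins chosen for illustration. The function names (`frequency_scores`, `cluster_scores`, `rank_outliers`, `likely_outliers`), the toy records, and the cluster labels are all hypothetical, and the clustering phase is assumed to have been carried out already by some categorical clustering method.

```python
from collections import Counter

def frequency_scores(records):
    """Mean relative frequency of each record's attribute values.

    Assumed stand-in score: a low value means the record is made up of rare values.
    """
    n = len(records)
    n_attrs = len(records[0])
    # One value-frequency table per attribute (column).
    counts = [Counter(r[j] for r in records) for j in range(n_attrs)]
    return [
        sum(counts[j][r[j]] for j in range(n_attrs)) / (n_attrs * n)
        for r in records
    ]

def cluster_scores(labels):
    """Fraction of the data set that shares each record's cluster.

    Assumed stand-in score: a low value means the record sits in a small cluster.
    """
    sizes = Counter(labels)
    n = len(labels)
    return [sizes[label] / n for label in labels]

def rank_outliers(scores):
    """Record indices ordered from most to least outlying (ascending score)."""
    return sorted(range(len(scores)), key=lambda i: scores[i])

def likely_outliers(records, labels, k):
    """Union of the top-k records from the two independent rankings."""
    by_frequency = rank_outliers(frequency_scores(records))[:k]
    by_cluster = rank_outliers(cluster_scores(labels))[:k]
    merged = []
    for i in by_frequency + by_cluster:
        if i not in merged:
            merged.append(i)
    return merged

# Hypothetical toy data: five records with three categorical attributes.
records = [
    ("red", "small", "round"),
    ("red", "small", "round"),
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
]
# Hypothetical cluster labels, assumed to come from a prior clustering phase.
labels = [0, 0, 0, 0, 1]

print(likely_outliers(records, labels, k=2))
```

In this sketch the two rankings are computed independently and merged by taking the union of their top-k lists, so a record that is unusual under either view is reported. How the paper actually combines its two rankings is not stated in the abstract.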