Outlier detection is often treated as a preprocessing step that locates the objects in a data set that do not conform to well-defined notions of expected behavior. It is a central problem in data mining for discovering novel or rare events, actions, and phenomena. We investigate outlier detection in categorical data sets. The problem is especially challenging because of the difficulty of defining a meaningful similarity measure for categorical data. In this paper, we propose a formal definition of outliers and formulate outlier detection as an optimization problem. To solve the optimization problem, we design a practical, parameter-free method named ITB. Experimental results show that the ITB method is much more effective and efficient than existing mainstream methods.