Outlier detection is a widely used data mining technique that aims to find anomalies and explain them well. Most existing methods are designed for numeric data and run into problems in real-life applications that contain categorical data. In this paper, we introduce a novel outlier mining method based on a hypergraph model. Because hypergraphs precisely capture the distribution characteristics of data subspaces, the method is effective at identifying anomalies in dense subspaces and gives good interpretations of local outlierness. Selecting the most relevant subspaces also ameliorates the "curse of dimensionality" in very large databases. Furthermore, connectivity replaces distance metrics, so no distance-based computation is needed, which makes the method more robust to data with missing values. Because connectivity can be computed with the aggregation operations supported by most SQL-compatible database systems, the mining process is also much more efficient. Finally, experiments and analysis show that our method finds outliers in categorical data with good performance and quality.
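The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea it describes: scoring categorical records by counting, in the spirit of a SQL `GROUP BY` with `COUNT(*)`, rather than by measuring distances. The function name `rare_value_scores`, the attribute names, and the scoring rule (relative frequency of a value within the group of records sharing another value) are all assumptions made for illustration, not the paper's definitions.

```python
from collections import Counter


def rare_value_scores(records, group_attr, target_attr):
    """Hypothetical scoring rule, not the paper's definition.

    For each record, return the fraction of records sharing its
    `group_attr` value that also share its `target_attr` value.
    A low fraction marks a value that is unusual within its group.
    Records with a missing value in either attribute get None and
    are left out of the counts.
    """
    group_sizes = Counter()   # like COUNT(*) ... GROUP BY group_attr
    pair_counts = Counter()   # like COUNT(*) ... GROUP BY group_attr, target_attr
    for rec in records:
        g, t = rec.get(group_attr), rec.get(target_attr)
        if g is None or t is None:
            continue  # no distance is computed, so a missing value is simply skipped
        group_sizes[g] += 1
        pair_counts[(g, t)] += 1

    scores = []
    for rec in records:
        g, t = rec.get(group_attr), rec.get(target_attr)
        if g is None or t is None:
            scores.append(None)
        else:
            scores.append(pair_counts[(g, t)] / group_sizes[g])
    return scores


# Illustrative, made-up records.
records = [
    {"job": "nurse", "shift": "night"},
    {"job": "nurse", "shift": "night"},
    {"job": "nurse", "shift": "night"},
    {"job": "nurse", "shift": "day"},
    {"job": "clerk", "shift": "day"},
    {"job": "clerk", "shift": None},
]
scores = rare_value_scores(records, "job", "shift")
```

In this made-up data, the nurse on the day shift receives the lowest score because that value is rare among the records sharing `job = "nurse"`. This also suggests how such a score can carry its own explanation: the group and the deviating attribute are named directly.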