Outlier detection can uncover malicious behavior in fields like intrusion detection and fraud analysis. Although there has been a significant amount of work on outlier detection, most of the algorithms proposed in the literature are based on a particular definition of outliers (e.g., density-based) and use ad-hoc thresholds to detect them. In this paper we present a novel technique to detect outliers with respect to an existing clustering model; the same test can also be used to recognize outliers when no clustering information is available. Our method is based on Transductive Confidence Machines, which have previously been proposed as a mechanism to provide individual confidence measures on classification decisions. For each cluster of the model, the test uses hypothesis testing to decide whether a point is fit to be in that cluster. We experimentally demonstrate that the test is highly robust and produces very few misdiagnosed points, even when no clustering information is available. Furthermore, our experiments demonstrate that the method remains robust when the data are contaminated by outliers. Finally, we show that our technique can be successfully applied to identify outliers in a noisy data set for which no information (e.g., ground truth or clustering structure) is available. As such, our methodology can bootstrap a clean data set from a noisy one, and the clean set can then be used to identify future outliers.
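The per-cluster hypothesis test described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the strangeness measure or the decision rule, so everything below is an assumption made for illustration: strangeness is taken to be the sum of distances to the k nearest neighbors within the cluster, the p-value is the fraction of points at least as strange as the candidate, and a point is flagged only if it is rejected by every cluster. The names `knn_strangeness`, `transductive_p_value`, `is_outlier`, `k`, and `significance` are hypothetical and not taken from the paper.

```python
import numpy as np

def knn_strangeness(points, k):
    """Assumed strangeness: sum of distances from each row to its k nearest other rows."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    np.fill_diagonal(dists, np.inf)          # a point is not its own neighbor
    nearest = np.sort(dists, axis=1)[:, :k]  # k smallest distances per row
    return nearest.sum(axis=1)

def transductive_p_value(cluster, candidate, k=3):
    """Fraction of points (candidate included) at least as strange as the candidate."""
    augmented = np.vstack([cluster, candidate])  # tentatively place the candidate in the cluster
    alphas = knn_strangeness(augmented, k)       # strangeness is recomputed for every point
    return float(np.mean(alphas >= alphas[-1]))

def is_outlier(clusters, candidate, significance=0.05, k=3):
    """Flag the candidate only if membership is rejected for every cluster (assumed rule)."""
    return all(transductive_p_value(c, candidate, k) < significance for c in clusters)
```

With n points in a cluster, the smallest attainable p-value is 1/(n+1), so under these assumptions a cluster needs at least 20 points before any candidate can be rejected at the 0.05 level. The significance level plays the role of a threshold, but it has a direct statistical reading as the tolerated rate of wrongly rejected cluster members.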