Novelty detection has been studied for many years and has found a wide range of applications, yet correctly identifying outliers remains a hard problem because outliers are both highly varied and scarce. We address the problem by exploiting several characteristics that distinguish outliers from normal patterns. First, normal patterns are usually grouped together, forming clusters in the high-density regions of the data space. Second, outliers differ markedly from normal patterns and hence tend to lie far away from them in the data space. Third, the number of outliers in a given dataset is generally very small. Based on these observations, we expect the appropriate decision boundary separating the outliers from the normal patterns to lie in a low-density region of the data space; this is referred to as the cluster assumption. The resulting optimization problem for learning the decision function can be solved using a mixed integer programming approach. We then present a cutting plane algorithm, combined with a multiple kernel learning technique, to solve a convex relaxation of this optimization problem. In particular, we exploit the scarcity of the outliers to find a violating solution for the cutting plane algorithm. Experimental results on several benchmark datasets show that our proposed novelty detection method outperforms existing hyperplane-based and density estimation-based novelty detection techniques. We subsequently apply our method to the prediction of banking failures, identifying potential bank failures or high-risk banks through the traits of financial distress.
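To make the intuition behind the cluster assumption concrete, the sketch below is a minimal illustration of the density estimation-based style of novelty detection that the abstract contrasts with, not the paper's convex method: points are scored by a Gaussian kernel density estimate over the normal training data, and a point whose density is low relative to the cluster is flagged as an outlier. The function name `kde_scores`, the bandwidth value, and the toy data are all illustrative assumptions.

```python
import numpy as np

def kde_scores(train, test, bandwidth=0.5):
    """Gaussian kernel density estimate of each test point under the training set.

    Low scores correspond to low-density regions of the data space, where
    (under the cluster assumption) outliers and decision boundaries tend to lie.
    """
    diff = test[:, None, :] - train[None, :, :]        # (n_test, n_train, dim)
    sq_dist = np.sum(diff * diff, axis=2)
    kernel = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return kernel.mean(axis=1)                          # average kernel response

rng = np.random.default_rng(0)
normal = rng.normal(0.0, 0.3, size=(200, 2))   # dense cluster of normal patterns
outlier = np.array([[3.0, 3.0]])               # a single point far from the cluster
scores = kde_scores(normal, np.vstack([normal[:5], outlier]))
# the last score (the outlier) is far smaller than those of the normal points
```

Because outliers are scarce and lie far from the clustered normal patterns, their density scores separate sharply from the rest, which is the property the paper's decision boundary is designed to exploit.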