Robust regression and outlier detection
Robust regression and outlier detection
ACM Computing Surveys (CSUR)
LOF: identifying density-based local outliers
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Efficient algorithms for mining outliers from large data sets
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Support vector domain description
Pattern Recognition Letters - Special issue on pattern recognition in practice VI
On-line unsupervised outlier detection using finite mixtures with discounting learning algorithms
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Outlier detection for high dimensional data
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
SVM binary classifier ensembles for image classification
Proceedings of the tenth international conference on Information and knowledge management
Text Categorization with Suport Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning
When Is ''Nearest Neighbor'' Meaningful?
ICDT '99 Proceedings of the 7th International Conference on Database Theory
Algorithms for Mining Distance-Based Outliers in Large Datasets
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Finding Intensional Knowledge of Distance-Based Outliers
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Distance-based outliers: algorithms and applications
The VLDB Journal — The International Journal on Very Large Data Bases
PEBL: positive example based learning for Web page classification using SVM
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Outlier Mining in Large High-Dimensional Data Sets
IEEE Transactions on Knowledge and Data Engineering
Example-Based Robust Outlier Detection in High Dimensional Datasets
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
Detecting outliers using transduction and statistical testing
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Hi-index | 0.00 |
Outlier detection is a useful technique in such areas as fraud detection, financial analysis and health monitoring. Many recent approaches detect outliers according to reasonable, pre-defined concepts of an outlier (e.g., distance-based, density-based, etc.). However, the definition of an outlier differs between users or even datasets. This paper presents a solution to this problem by including input from the users. Our OBE (Outlier By Example) system is the first that allows users to provide examples of outliers in low-dimensional datasets. By incorporating a small number of such examples, OBE can successfully develop an algorithm by which to identify further outliers based on their outlierness. Several algorithmic challenges and engineering decisions must be addressed in building such a system. We describe the key design decisions and algorithms in this paper. In order to interact with users having different degrees of domain knowledge, we develop two detection schemes: OBE-Fraction and OBE-RF. Our experiments on both real and synthetic datasets demonstrate that OBE can discover values that a user would consider outliers.