A distance-based outlier detection method is proposed that finds the top outliers in an unlabeled data set and returns a subset of the data, called the outlier detection solving set, that can be used to predict the outlierness of new, unseen objects. The solving set contains enough points to detect the top outliers while considering only a subset of all the pairwise distances in the data set. The properties of the solving set are investigated, and algorithms that compute it in subquadratic time are proposed. Experiments on synthetic and real data sets evaluate the effectiveness of the approach. A scaling analysis of the solving set size is performed, and the false positive rate, that is, the fraction of new objects misclassified as outliers when the solving set is used instead of the whole data set, is shown to be negligible. Finally, ROC analysis is used to assess how accurately the method separates outliers from inliers. The results show that using the solving set instead of the whole data set yields predictions of comparable quality at a lower computational cost.
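
To make the terms in the abstract concrete, the following is a minimal brute-force sketch, not the solving set algorithm itself, which the abstract does not spell out. It rests on assumptions that are not stated above: the outlierness of an object is taken to be the sum of the distances to its k nearest neighbours, a new object is predicted to be an outlier when that score reaches the score of the n-th top outlier, and the reference subset is hand-picked rather than computed. All function and variable names are hypothetical.

```python
import math


def knn_weight(q, points, k, skip_index=None):
    """Assumed outlierness score: sum of the distances from q to its
    k nearest neighbours among `points` (optionally skipping one index,
    so that a data point is not compared with itself)."""
    dists = sorted(
        math.dist(q, p) for i, p in enumerate(points) if i != skip_index
    )
    return sum(dists[:k])


def top_outliers(data, k, n):
    """Brute-force baseline: score every point against the whole data set
    (quadratic in the number of points) and return the indices of the n
    highest-scoring points together with all the scores."""
    weights = [knn_weight(p, data, k, skip_index=i) for i, p in enumerate(data)]
    order = sorted(range(len(data)), key=lambda i: weights[i], reverse=True)
    return order[:n], weights


def predict_outlier(q, reference, k, threshold):
    """Predict whether a new object q is an outlier by scoring it against
    `reference`, which may be the whole data set or a subset of it."""
    return knn_weight(q, reference, k) >= threshold


def false_positive_rate(new_objects, data, subset, k, threshold):
    """Fraction of new objects flagged as outliers with `subset` as the
    reference but not flagged with the whole data set as the reference."""
    false_positives = sum(
        1
        for q in new_objects
        if predict_outlier(q, subset, k, threshold)
        and not predict_outlier(q, data, k, threshold)
    )
    return false_positives / len(new_objects)


# Toy data: a tight cluster plus one distant point.
data = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
top, weights = top_outliers(data, k=2, n=1)
threshold = weights[top[-1]]  # score of the n-th top outlier

# Hand-picked subset standing in for a solving set (an assumption).
subset = [(0, 0), (1, 1), (10, 10)]
new_objects = [(20, 20), (0.5, 0.4), (9, 9)]
print(top, false_positive_rate(new_objects, data, subset, 2, threshold))
```

Under these assumptions, scoring against a subset can only overestimate an object's score, because its k nearest neighbours in a subset are never closer than its k nearest neighbours in the whole data set. Replacing the data set with a subset can therefore introduce false positives but no missed outliers, which is why the false positive rate is the natural quantity to measure.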