Efficient algorithms for mining outliers from large data sets

Authors:
Sridhar Ramaswamy;Rajeev Rastogi;Kyuseok Shim
Affiliations:
Epiphany Inc., Palo Alto, CA;Bell Laboratories, Murray Hill, NJ;Korea Advanced Institute of Science and Technology and Advanced Information Technology Research Center at KAIST, Taejon, Korea
Venue:
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Year:
2000

References: 12
Cited by: 240

Algorithms for clustering data

Algorithms for clustering data
The design and analysis of spatial data structures

The design and analysis of spatial data structures
The R*-tree: an efficient and robust access method for points and rectangles

SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
Nearest neighbor queries

SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
BIRCH: an efficient data clustering method for very large databases

SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
CURE: an efficient clustering algorithm for large databases

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
LOF: identifying density-based local outliers

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Discovery-Driven Exploration of OLAP Data Cubes

EDBT '98 Proceedings of the 6th International Conference on Extending Database Technology: Advances in Database Technology
PUBLIC: A Decision Tree Classifier that Integrates Building and Pruning

VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Algorithms for Mining Distance-Based Outliers in Large Datasets

VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Finding Intensional Knowledge of Distance-Based Outliers

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Efficient and Effective Clustering Methods for Spatial Data Mining

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases

LOF: identifying density-based local outliers

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Outlier detection for high dimensional data

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Robust space transformations for distance-based operations

Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Mining top-n local outliers in large databases

Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Detecting graph-based spatial outliers: algorithms and applications (a summary of results)

Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Findout: finding outliers in very large datasets

Knowledge and Information Systems
Fast Outlier Detection in High Dimensional Spaces

PKDD '02 Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery
Outlier Detection Integrating Semantic Knowledge

WAIM '02 Proceedings of the Third International Conference on Advances in Web-Age Information Management
Outlier Detection Using Replicator Neural Networks

DaWaK 2000 Proceedings of the 4th International Conference on Data Warehousing and Knowledge Discovery
Enhancing Effectiveness of Outlier Detections for Low Density Patterns

PAKDD '02 Proceedings of the 6th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
Declustering Spatial Objects by Clustering for Parallel Disks

DEXA '01 Proceedings of the 12th International Conference on Database and Expert Systems Applications
Discovering cluster-based local outliers

Pattern Recognition Letters
Outlier Detection Algorithms in Data Mining Systems

Programming and Computer Software
Mining association rules on significant rare data using relative support

Journal of Systems and Software
Efficient Biased Sampling for Approximate Clustering and Outlier Detection in Large Data Sets

IEEE Transactions on Knowledge and Data Engineering
Unsupervised Link Discovery in Multi-relational Data via Rarity Analysis

ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Detecting region outliers in meteorological data

GIS '03 Proceedings of the 11th ACM international symposium on Advances in geographic information systems
Mining distance-based outliers in near linear time with randomization and a simple pruning rule

Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Distributed deviation detection in sensor networks

ACM SIGMOD Record
Framework for mining web content outliers

Proceedings of the 2004 ACM symposium on Applied computing
Cleaning the Spurious Links in Data

IEEE Intelligent Systems
Using unsupervised link discovery methods to find interesting facts and connections in a bibliography dataset

ACM SIGKDD Explorations Newsletter
A Survey of Outlier Detection Methodologies

Artificial Intelligence Review
MORPHEUS: motif oriented representations to purge hostile events from unlabeled sequences

Proceedings of the 2004 ACM workshop on Visualization and data mining for computer security
A vertical distance-based outlier detection method with local pruning

Proceedings of the thirteenth ACM international conference on Information and knowledge management
Outlier Mining in Large High-Dimensional Data Sets

IEEE Transactions on Knowledge and Data Engineering
An effective and efficient algorithm for high-dimensional outlier detection

The VLDB Journal — The International Journal on Very Large Data Bases
Mining web content outliers using structure oriented weighting techniques and N-grams

Proceedings of the 2005 ACM symposium on Applied computing
Detection and prediction of distance-based outliers

Proceedings of the 2005 ACM symposium on Applied computing
Feature bagging for outlier detection

Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Using Datacube Aggregates for Approximate Querying and Deviation Detection

IEEE Transactions on Knowledge and Data Engineering
Parallel Algorithms for Distance-Based and Density-Based Outliers

ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
Distance-Based Detection and Prediction of Outliers

IEEE Transactions on Knowledge and Data Engineering
Enhancing Data Analysis with Noise Removal

IEEE Transactions on Knowledge and Data Engineering
An outlier-based data association method for linking criminal incidents

Decision Support Systems - Special issue: Intelligence and security informatics
Detecting outliers using transduction and statistical testing

Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Mining distance-based outliers from large databases in any metric space

Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Outlier detection by active learning

Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Outlier detection by sampling with accuracy guarantees

Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Reverse Nearest Neighbor Search in Metric Spaces

IEEE Transactions on Knowledge and Data Engineering
SLOM: a new measure for local spatial outliers

Knowledge and Information Systems
Finding centric local outliers in categorical/numerical spaces

Knowledge and Information Systems
Online outlier detection in sensor data using non-parametric models

VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Detecting outliers in interval data

Proceedings of the 44th annual Southeast regional conference
Problem diagnosis in large-scale computing environments

Proceedings of the 2006 ACM/IEEE conference on Supercomputing
The pairwise attribute noise detection algorithm

Knowledge and Information Systems - Special Issue on Mining Low-Quality Data
Approximate range-sum query answering on data cubes with probabilistic guarantees

Journal of Intelligent Information Systems
Identifying noisy features with the Pairwise Attribute Noise Detection Algorithm

Intelligent Data Analysis
Web outlier mining: Discovering outliers from web datasets

Intelligent Data Analysis
Conditional Anomaly Detection

IEEE Transactions on Knowledge and Data Engineering
An overview of anomaly detection techniques: Existing solutions and latest technological trends

Computer Networks: The International Journal of Computer and Telecommunications Networking
A hybrid machine learning approach to network anomaly detection

Information Sciences: an International Journal
A trend pattern assessment approach to microarray gene expression profiling data analysis

Pattern Recognition Letters
From outliers to prototypes: Ordering data

Neurocomputing
Network anomaly detection with incomplete audit data

Computer Networks: The International Journal of Computer and Telecommunications Networking
Outlier detection in sensor networks

Proceedings of the 8th ACM international symposium on Mobile ad hoc networking and computing
A trimmed mean approach to finding spatial outliers

Intelligent Data Analysis
Outlier detection by logic programming

ACM Transactions on Computational Logic (TOCL)
Condensed Nearest Neighbor Data Domain Description

IEEE Transactions on Pattern Analysis and Machine Intelligence
Hos-Miner: a system for detecting outlying subspaces of high-dimensional data

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Very efficient mining of distance-based outliers

Proceedings of the sixteenth ACM conference on information and knowledge management
Detecting distance-based outliers in streams of data

Proceedings of the sixteenth ACM conference on information and knowledge management
A genetic approach for efficient outlier detection in projected space

Pattern Recognition
Mining approximate top-k subspace anomalies in multi-dimensional time-series data

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
A Bayesian method for guessing the extreme values in a data set?

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Multi-scale anomaly detection algorithm based on infrequent pattern of time series

Journal of Computational and Applied Mathematics
LDBOD: A novel local distribution based outlier detector

Pattern Recognition Letters
DMTracker: finding bugs in large-scale parallel programs by detecting anomaly in data movements

Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Fast mining of distance-based outliers in high-dimensional datasets

Data Mining and Knowledge Discovery
High performance computing for spatial outliers detection using parallel wavelet transform

Intelligent Data Analysis
CURIO: a fast outlier and outlier cluster detection algorithm for large datasets

AIDM '07 Proceedings of the 2nd international workshop on Integrating artificial intelligence and data mining - Volume 84
Angle-based outlier detection in high-dimensional data

Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Local peculiarity factor and its application in outlier detection

Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Outlier detection using default reasoning

Artificial Intelligence
Network Anomalous Attack Detection Based on Clustering and Classifier

Computational Intelligence and Security
Unsupervised Outlier Detection in Sensor Networks Using Aggregation Tree

ADMA '07 Proceedings of the 3rd international conference on Advanced Data Mining and Applications
Efficiently finding unusual shapes in large image databases

Data Mining and Knowledge Discovery
DIVFRP: An automatic divisive hierarchical clustering method based on the furthest reference points

Pattern Recognition Letters
Outlier Detection Based on Granular Computing

RSCTC '08 Proceedings of the 6th International Conference on Rough Sets and Current Trends in Computing
Knowledge Discovery from Honeypot Data for Monitoring Malicious Attacks

AI '08 Proceedings of the 21st Australasian Joint Conference on Artificial Intelligence: Advances in Artificial Intelligence
DOLPHIN: An efficient algorithm for mining distance-based outliers in very large datasets

ACM Transactions on Knowledge Discovery from Data (TKDD)
Some issues about outlier detection in rough set theory

Expert Systems with Applications: An International Journal
Projected outlier detection in high-dimensional mixed-attributes data set

Expert Systems with Applications: An International Journal
Finding anomalous periodic time series

Machine Learning
Detecting outlying properties of exceptional objects

ACM Transactions on Database Systems (TODS)
Domain independent data discrepancy detection using ensemble learning

ICCOMP'08 Proceedings of the 12th WSEAS international conference on Computers
Hiding distinguished ones into crowd: privacy-preserving publishing data with outliers

Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Generating design knowledge through data mining

Journal of Computing Sciences in Colleges
Guessing the extreme values in a data set: a Bayesian method and its applications

The VLDB Journal — The International Journal on Very Large Data Bases
Parameterless outlier detection in data streams

Proceedings of the 2009 ACM symposium on Applied Computing
A New Local Distance-Based Outlier Detection Approach for Scattered Real-World Data

PAKDD '09 Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
Mining Outliers with Faster Cutoff Update and Space Utilization

PAKDD '09 Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
Anomaly detection: A survey

ACM Computing Surveys (CSUR)
Outlier detection based on rough sets theory

Intelligent Data Analysis
Efficient anomaly monitoring over moving object trajectory streams

Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
A hybrid novelty score and its use in keystroke dynamics-based user authentication

Pattern Recognition
Mining in Large Noisy Domains

Journal of Data and Information Quality (JDIQ)
SubCOID: an attempt to explore cluster-outlier iterative detection approach to multi-dimensional data analysis in subspace

Proceedings of the 46th Annual Southeast Regional Conference
Discovering special product features for improving the process of product selection in E-commerce environment

Proceedings of the 11th International Conference on Electronic Commerce
Anomaly detection and spatio-temporal analysis of global climate system

Proceedings of the Third International Workshop on Knowledge Discovery from Sensor Data
A Comparative Study of Outlier Detection Algorithms

MLDM '09 Proceedings of the 6th International Conference on Machine Learning and Data Mining in Pattern Recognition
A comprehensive survey of numeric and symbolic outlier mining techniques

Intelligent Data Analysis
RE2-CD: Robust and Energy Efficient Cut Detection in Wireless Sensor Networks

WASA '09 Proceedings of the 4th International Conference on Wireless Algorithms, Systems, and Applications
Detecting Projected Outliers in High-Dimensional Data Streams

DEXA '09 Proceedings of the 20th International Conference on Database and Expert Systems Applications
Efficient Pruning Schemes for Distance-Based Outlier Detection

ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part II
Improving quality assurance in education with web-based services by data mining and mobile technologies

Proceedings of the 2008 Euro American Conference on Telematics and Information Systems
Knowledge discovery from imbalanced and noisy data

Data & Knowledge Engineering
A comparison of outlier detection algorithms for ITS data

Expert Systems with Applications: An International Journal
LoOP: local outlier probabilities

Proceedings of the 18th ACM conference on Information and knowledge management
Anomaly Detection from Call Data Records

PReMI '09 Proceedings of the 3rd International Conference on Pattern Recognition and Machine Intelligence
SOMSO: a self-organizing map approach for spatial outlier detection with multiple attributes

IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks
Distance-based outlier queries in data streams: the novel task and algorithms

Data Mining and Knowledge Discovery
HOT: hypergraph-based outlier test for categorical data

PAKDD'03 Proceedings of the 7th Pacific-Asia conference on Advances in knowledge discovery and data mining
Hyperclique pattern based off-topic detection

APWeb/WAIM'07 Proceedings of the joint 9th Asia-Pacific web and 8th international conference on web-age information management conference on Advances in data and web management
Outlier detection with streaming dyadic decomposition

ICDM'07 Proceedings of the 7th industrial conference on Advances in data mining: theoretical aspects and applications
Parallel wavelet transform for spatio-temporal outlier detection in large meteorological data

IDEAL'07 Proceedings of the 8th international conference on Intelligent data engineering and automated learning
Correlation-based detection of attribute outliers

DASFAA'07 Proceedings of the 12th international conference on Database systems for advanced applications
An efficient histogram method for outlier detection

DASFAA'07 Proceedings of the 12th international conference on Database systems for advanced applications
Efficiently mining regional outliers in spatial data

SSTD'07 Proceedings of the 10th international conference on Advances in spatial and temporal databases
Cell-based outlier detection algorithm: a fast outlier detection algorithm for large datasets

PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining
A new algorithm for high-dimensional outlier detection based on constrained particle swarm intelligence

RSKT'08 Proceedings of the 3rd international conference on Rough sets and knowledge technology
Fourier transform based spatial outlier mining

IDEAL'09 Proceedings of the 10th international conference on Intelligent data engineering and automated learning
Detecting outliers in categorical record databases based on attribute associations

APWeb'08 Proceedings of the 10th Asia-Pacific web conference on Progress in WWW research and development
Outlier detection via localized p-value estimation

Allerton'09 Proceedings of the 47th annual Allerton conference on Communication, control, and computing
An information entropy-based approach to outlier detection in rough sets

Expert Systems with Applications: An International Journal
An attack classification mechanism based on multiple support vector machines

ICCSA'07 Proceedings of the 2007 international conference on Computational science and Its applications - Volume Part II
Mining Outliers in Correlated Subspaces for High Dimensional Data Sets

Fundamenta Informaticae - Intelligent Data Analysis in Granular Computing
Anomaly detection in streaming environmental sensor data: A data-driven modeling approach

Environmental Modelling & Software
Mining outliers with faster cutoff update and space utilization

Pattern Recognition Letters
Outlier detection in transactional data

Intelligent Data Analysis
Neighborhood outlier detection

Expert Systems with Applications: An International Journal
Fuzzy clustering-based approach for outlier detection

ACE'10 Proceedings of the 9th WSEAS international conference on Applications of computer engineering
A Framework for Large-Scale Detection of Web Site Defacements

ACM Transactions on Internet Technology (TOIT)
New outlier detection method based on fuzzy clustering

WSEAS Transactions on Information Science and Applications
A reference based analysis framework for analyzing system call traces

Proceedings of the Sixth Annual Workshop on Cyber Security and Information Intelligence Research
Soft fuzzy rough sets for robust feature evaluation and selection

Information Sciences: an International Journal
Mining Outliers with Adaptive Cutoff Update and Space Utilization (RACAS)

Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence
A fast algorithm for robust mixtures in the presence of measurement errors

IEEE Transactions on Neural Networks
Privacy-preserving matching of spatial datasets with protection against background knowledge

Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems
A distributed approach to detect outliers in very large data sets

EuroPar'10 Proceedings of the 16th international Euro-Par conference on Parallel processing: Part I
On detecting clustered anomalies using SCiForest

ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part II
Synchronization based outlier detection

ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part III
Towards improving subspace data analysis

Proceedings of the 48th Annual Southeast Regional Conference
Distance-based outlier detection: consolidation and renewed bearing

Proceedings of the VLDB Endowment
Thresholds based outlier detection approach for mining class outliers: An empirical case study on software measurement datasets

Expert Systems with Applications: An International Journal
Two-stage outlier elimination for robust curve and surface fitting

EURASIP Journal on Advances in Signal Processing - Special issue on robust processing of nonstationary signals
Towards robustness and energy efficiency of cut detection in wireless sensor networks

Ad Hoc Networks
Atypicity detection in data streams: A self-adjusting approach

Intelligent Data Analysis - Ubiquitous Knowledge Discovery
Membership enhancement with exponential fuzzy clustering for collaborative filtering

ICONIP'10 Proceedings of the 17th international conference on Neural information processing: theory and algorithms - Volume Part I
Anomaly detection in monitoring sensor data for preventive maintenance

Expert Systems with Applications: An International Journal
An improved KNN based outlier detection algorithm for large datasets

ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications: Part I
Outlier detection by example

Journal of Intelligent Information Systems
Improving Gaussian process classification with outlier detection: with applications in image classification

ACCV'10 Proceedings of the 10th Asian conference on Computer vision - Volume Part IV
Fast outlier detection for very large log data

Expert Systems with Applications: An International Journal
Finding key knowledge attribute subspace of outliers in high-dimensional dataset

Expert Systems with Applications: An International Journal
Active learning and subspace clustering for anomaly detection

Intelligent Data Analysis
Sample-space bright spots removal using density estimation

Proceedings of Graphics Interface 2011
Anomaly detection techniques for a web defacement monitoring service

Expert Systems with Applications: An International Journal
SDDB: a self-dependent and data-based method for constructing bilingual dictionary from the web

APWeb'11 Proceedings of the 13th Asia-Pacific web conference on Web technologies and applications
Detecting outlier sections in US congressional legislation

Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Algorithms for speeding up distance-based outlier detection

Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
NDoT: nearest neighbor distance based outlier detection technique

PReMI'11 Proceedings of the 4th international conference on Pattern recognition and machine intelligence
Robust fuzzy rough classifiers

Fuzzy Sets and Systems
Finding fraud in health insurance data with two-layer outlier detection approach

DaWaK'11 Proceedings of the 13th international conference on Data warehousing and knowledge discovery
A neural network based retrainable framework for robust object recognition with application to mobile robotics

Applied Intelligence
A hybrid approach to outlier detection based on boundary region

Pattern Recognition Letters
A novel outlier detection method for spatio-temporal trajectory data

ICHIT'11 Proceedings of the 5th international conference on Convergence and hybrid information technology
A survey of outlier detection methodologies and their applications

AICI'11 Proceedings of the Third international conference on Artificial intelligence and computational intelligence - Volume Part I
Motif-based attack detection in network communication graphs

CMS'11 Proceedings of the 12th IFIP TC 6/TC 11 international conference on Communications and multimedia security
Detecting anomalies in graphs with numeric labels

Proceedings of the 20th ACM international conference on Information and knowledge management
LSH based outlier detection and its application in distributed setting

Proceedings of the 20th ACM international conference on Information and knowledge management
Spatial categorical outlier detection: pair correlation function based approach

Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
Mining outliers in spatial networks

DASFAA'06 Proceedings of the 11th international conference on Database Systems for Advanced Applications
A nonparametric outlier detection for effectively discovering top-n outliers from engineering data

PAKDD'06 Proceedings of the 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
A fast greedy algorithm for outlier mining

PAKDD'06 Proceedings of the 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
Ranking outliers using symmetric neighborhood relationship

PAKDD'06 Proceedings of the 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
On robust and effective k-anonymity in large databases

PAKDD'06 Proceedings of the 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
Grid-ODF: detecting outliers effectively and efficiently in large multi-dimensional databases

CIS'05 Proceedings of the 2005 international conference on Computational Intelligence and Security - Volume Part I
An auto-stopped hierarchical clustering algorithm integrating outlier detection algorithm

WAIM'05 Proceedings of the 6th international conference on Advances in Web-Age Information Management
A unified subspace outlier ensemble framework for outlier detection

WAIM'05 Proceedings of the 6th international conference on Advances in Web-Age Information Management
Outlier detection in relational data: A case study in geographical information systems

Expert Systems with Applications: An International Journal
Simple instance selection for bankruptcy prediction

Knowledge-Based Systems
Network anomaly detection based on clustering of sequence patterns

ICCSA'06 Proceedings of the 2006 international conference on Computational Science and Its Applications - Volume Part II
Mining bridging rules between conceptual clusters

Applied Intelligence
Introduction to data mining for sustainability

Data Mining and Knowledge Discovery
Visual interactive evolutionary algorithm for high dimensional outlier detection and data clustering problems

International Journal of Bio-Inspired Computation
Mining outliers with ensemble of heterogeneous detectors on random subspaces

DASFAA'10 Proceedings of the 15th international conference on Database Systems for Advanced Applications - Volume Part I
Visual evaluation of outlier detection models

DASFAA'10 Proceedings of the 15th international conference on Database Systems for Advanced Applications - Volume Part II
Hybrid approach to web content outlier mining without query vector

DaWaK'05 Proceedings of the 7th international conference on Data Warehousing and Knowledge Discovery
Outlier detection using rough set theory

RSFDGrC'05 Proceedings of the 10th international conference on Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing - Volume Part II
Isolation-Based Anomaly Detection

ACM Transactions on Knowledge Discovery from Data (TKDD)
A distributed algorithm for outlier detection in a large database

DNIS'05 Proceedings of the 4th international conference on Databases in Networked Information Systems
An optimization model for outlier detection in categorical data

ICIC'05 Proceedings of the 2005 international conference on Advances in Intelligent Computing - Volume Part I
Condensed nearest neighbor data domain description

IDA'05 Proceedings of the 6th international conference on Advances in Intelligent Data Analysis
Collusion set detection through outlier discovery

ISI'05 Proceedings of the 2005 IEEE international conference on Intelligence and Security Informatics
A fuzzy index for detecting spatiotemporal outliers

Geoinformatica
An approach to extract special skills to improve the performance of resume selection

DNIS'10 Proceedings of the 6th international conference on Databases in Networked Information Systems
Similarity kernels for nearest neighbor-based outlier detection

IDA'10 Proceedings of the 9th international conference on Advances in Intelligent Data Analysis
Distance-based outliers in sequences

ICDCIT'05 Proceedings of the Second international conference on Distributed Computing and Internet Technology
Mining special features to improve the performance of e-commerce product selection and resume processing

International Journal of Computational Science and Engineering
A cross datasets referring outlier detection model applied to suspicious financial transaction discrimination

WISI'06 Proceedings of the 2006 international conference on Intelligence and Security Informatics
Development and application of tender evaluation decision-making and risk early warning system for water projects based on KDD

Advances in Engineering Software
Outlier respecting points approximation

ISAAC'11 Proceedings of the 22nd international conference on Algorithms and Computation
Anomalistic sequence detection

International Journal of Intelligent Information and Database Systems
Detection of variable length anomalous subsequences in data streams

International Journal of Intelligent Information and Database Systems
Clustering by Sorting Potential Values (CSPV): A novel potential-based clustering method

Pattern Recognition
Distance-based outlier detection on uncertain data of Gaussian distribution

APWeb'12 Proceedings of the 14th Asia-Pacific international conference on Web Technologies and Applications
An experimental comparison of real and artificial deception using a deception generation model

Decision Support Systems
Integrating community matching and outlier detection for mining evolutionary community outliers

Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
A near-linear time approximation algorithm for angle-based outlier detection in high-dimensional data

Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Security Through Collaboration and Trust in MANETs

Mobile Networks and Applications
On scales, salience and referential language use

AC'11 Proceedings of the 18th Amsterdam colloquium conference on Logic, Language and Meaning
Measuring stability of feature ranking techniques: a noise-based approach

International Journal of Business Intelligence and Data Mining
A minimum spanning tree-inspired clustering-based outlier detection technique

ICDM'12 Proceedings of the 12th Industrial conference on Advances in Data Mining: applications and theoretical aspects
Detecting ECG abnormalities via transductive transfer learning

Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine
A survey on unsupervised outlier detection in high-dimensional numerical data

Statistical Analysis and Data Mining
An evolutionary approach for high dimensional attribute selection

International Journal of Intelligent Information and Database Systems
Outlier detection using centrality and center-proximity

Proceedings of the 21st ACM international conference on Information and knowledge management
Continuous adaptive outlier detection on distributed data streams

HPCC'07 Proceedings of the Third international conference on High Performance Computing and Communications
AUDIO: an integrity auditing framework of outlier-mining-as-a-service systems

ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II
Discovering inappropriate billings with local density based outlier detection method

AusDM '09 Proceedings of the Eighth Australasian Data Mining Conference - Volume 101
Interactive data mining with 3D-parallel-coordinate-trees

Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
Outlier ensembles: position paper

ACM SIGKDD Explorations Newsletter
Spin image revisited: fast candidate selection using outlier forest search

ACCV'12 Proceedings of the 11th international conference on Computer Vision - Volume 2
Subsampling for efficient and effective unsupervised outlier detection ensembles

Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Enhancing one-class support vector machines for unsupervised anomaly detection

Proceedings of the ACM SIGKDD Workshop on Outlier Detection and Description
Combining co-clustering with noise detection for theme-based summarization

ACM Transactions on Speech and Language Processing (TSLP)
Clustering and outlier detection using isoperimetric number of trees

Pattern Recognition
Fast top-k distance-based outlier detection on uncertain data

WAIM'13 Proceedings of the 14th international conference on Web-Age Information Management
MEFES: An evolutionary proposal for the detection of exceptions in subgroup discovery. An application to Concentrating Photovoltaic Technology

Knowledge-Based Systems
A methodological overview on anomaly detection

Data Traffic Monitoring and Analysis
Local outlier detection reconsidered: a generalized view on locality with applications to spatial, video, and network outlier detection

Data Mining and Knowledge Discovery
Hyperspherical cluster based distributed anomaly detection in wireless sensor networks

Journal of Parallel and Distributed Computing
Hybrid email spam detection model with negative selection algorithm and differential evolution

Engineering Applications of Artificial Intelligence
A reference based analysis framework for understanding anomaly detection techniques for symbolic sequences

Data Mining and Knowledge Discovery
A novelty detection machine and its application to bank failure prediction

Neurocomputing
Exploiting domain knowledge to detect outliers

Data Mining and Knowledge Discovery
Ensembles for unsupervised outlier detection: challenges and research questions a position paper

ACM SIGKDD Explorations Newsletter
A multivariate fuzzy system applied for outliers detection

Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology

Quantified Score

Hi-index	0.02

Abstract

In this paper, we propose a novel formulation for distance-based outliers that is based on the distance of a point from its kth nearest neighbor. We rank each point on the basis of its distance to its kth nearest neighbor and declare the top n points in this ranking to be outliers. In addition to developing relatively straightforward solutions to finding such outliers based on the classical nested-loop join and index join algorithms, we develop a highly efficient partition-based algorithm for mining outliers. This algorithm first partitions the input data set into disjoint subsets, and then prunes entire partitions as soon as it is determined that they cannot contain outliers. This results in substantial savings in computation. We present the results of an extensive experimental study on real-life and synthetic data sets. The results from a real-life NBA database highlight and reveal several expected and unexpected aspects of the database. The results from a study on synthetic data sets demonstrate that the partition-based algorithm scales well with respect to both data set size and data set dimensionality.
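
The outlier definition in the abstract (rank every point by the distance to its kth nearest neighbor, then report the top n) can be illustrated with a minimal brute-force sketch in the spirit of the nested-loop approach. This is not the paper's partition-based algorithm and performs no pruning; the function names `kth_nn_distance` and `top_n_outliers`, the Euclidean metric, and the sample points are assumptions made for illustration only.

```python
import heapq
import math


def kth_nn_distance(points, i, k):
    """Distance from points[i] to its k-th nearest neighbor, excluding itself.

    Assumes Euclidean distance; the abstract does not fix a metric.
    """
    dists = [math.dist(points[i], q) for j, q in enumerate(points) if j != i]
    # nsmallest returns the k smallest distances in ascending order,
    # so the last one is the distance to the k-th nearest neighbor.
    return heapq.nsmallest(k, dists)[-1]


def top_n_outliers(points, k, n):
    """Return the n points with the largest k-th nearest neighbor distance.

    Brute force: O(N^2) distance computations, with no index and no pruning.
    Each result is a (score, index) pair, highest score first.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    scored = [(kth_nn_distance(points, i, k), i) for i in range(len(points))]
    return heapq.nlargest(n, scored)


# Hypothetical data: four points clustered near the origin and one far away.
sample = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
result = top_n_outliers(sample, k=2, n=1)
```

With this sample, the single reported outlier is the point at index 4, whose second-nearest neighbor is far further away than that of any point in the cluster. The partition-based algorithm described in the abstract aims to produce the same ranking while skipping whole partitions that cannot contain any of the top n points.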