This paper deals with finding outliers (exceptions) in large, multidimensional datasets. The identification of outliers can lead to the discovery of truly unexpected knowledge in areas such as electronic commerce, credit card fraud, and even the analysis of performance statistics of professional athletes. Existing methods that we have seen for finding outliers can only deal efficiently with two dimensions/attributes of a dataset. In this paper, we study the notion of DB (distance-based) outliers. Specifically, we show that (i) outlier detection can be done efficiently for large datasets, and for k-dimensional datasets with large values of k (e.g., $k \ge 5$); and (ii) outlier detection is a meaningful and important knowledge discovery task.

First, we present two simple algorithms, both having a complexity of $O(k \: N^2)$, k being the dimensionality and N being the number of objects in the dataset. These algorithms readily support datasets with many more than two attributes. Second, we present an optimized cell-based algorithm that has a complexity that is linear with respect to N, but exponential with respect to k. We provide experimental results indicating that this algorithm significantly outperforms the two simple algorithms for $k \leq 4$. Third, for datasets that are mainly disk-resident, we present another version of the cell-based algorithm that guarantees at most three passes over a dataset. Again, experimental results show that this algorithm is by far the best for $k \leq 4$. Finally, we discuss our work on three real-life applications, including one on spatio-temporal data (e.g., a video surveillance application), in order to confirm the relevance and broad applicability of DB outliers.
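
To make the $O(k \: N^2)$ cost of the simple algorithms concrete, the following is a minimal brute-force sketch in Python, not the paper's own algorithm. The abstract does not state the definition of a DB outlier, so the sketch assumes the commonly cited one: an object is a DB(p, D)-outlier if at least a fraction p of the objects in the dataset lie at a distance greater than D from it. The function name `db_outliers`, its parameters, and the choice of Euclidean distance are all illustrative assumptions.

```python
import math


def db_outliers(points, p, d):
    """Return indices of assumed DB(p, d)-outliers, found by brute force.

    Assumed definition (not given in the abstract): a point is an outlier
    if at least a fraction p of all points lie farther than d from it.
    Each of the N*N distance computations costs O(k), hence O(k * N^2).
    """
    n = len(points)
    outliers = []
    for i, a in enumerate(points):
        # Count the points lying strictly farther than d from point a.
        far = sum(1 for b in points if math.dist(a, b) > d)
        if far >= p * n:
            outliers.append(i)
    return outliers


if __name__ == "__main__":
    # Four points clustered near the origin and one isolated point.
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    print(db_outliers(data, p=0.75, d=2.0))
```

Because every pair of objects is compared, the running time grows quadratically with N but only linearly with k, which is consistent with the abstract's remark that the simple algorithms readily support many attributes.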