Mining association rules between sets of items in large databases
SIGMOD '93 Proceedings of the 1993 ACM SIGMOD international conference on Management of data
BIRCH: an efficient data clustering method for very large databases
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
LOF: identifying density-based local outliers
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Finding Aggregate Proximity Relationships and Commonalities in Spatial Data Mining
IEEE Transactions on Knowledge and Data Engineering
VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Algorithms for Mining Distance-Based Outliers in Large Datasets
VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Knowledge Discovery in Databases: An Attribute-Oriented Approach
VLDB '92 Proceedings of the 18th International Conference on Very Large Data Bases
Efficient and Effective Clustering Methods for Spatial Data Mining
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
A unified approach for mining outliers
CASCON '97 Proceedings of the 1997 conference of the Centre for Advanced Studies on Collaborative research
On Digital Money and Card Technologies
Parallel Algorithms for Distance-Based and Density-Based Outliers
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
Disk aware discord discovery: finding unusual time series in terabyte sized datasets
Knowledge and Information Systems
A comprehensive survey of numeric and symbolic outlier mining techniques
Intelligent Data Analysis
A distributed approach to detect outliers in very large data sets
EuroPar'10 Proceedings of the 16th international Euro-Par conference on Parallel processing: Part I
Algorithms for speeding up distance-based outlier detection
Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Accelerating outlier detection with uncertain data using graphics processors
PAKDD'12 Proceedings of the 16th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part II
Data mining is a new, important, and fast-growing database application. Outlier (exception) detection is one kind of data mining, with applications in areas such as monitoring credit card fraud and criminal activity in electronic commerce. As databases grow in both size and number of attributes (dimensions), previously proposed detection methods designed for two dimensions are no longer applicable. The time complexity of the Nested-Loop (NL) algorithm (Knorr and Ng, in Proc. 24th VLDB, 1998) is linear in the dimensionality but quadratic in the dataset size, which incurs an unacceptable cost for large datasets. A more efficient version (ENL) and its parallel version (PENL) are introduced. In theory, the performance improvement of PENL is linear in the number of processors, as shown by a performance comparison between ENL and PENL under the Bulk Synchronous Parallel (BSP) model. This improvement is further verified by experiments on an IBM 9076 SP2 parallel computer system. The results show that the approach is a very good choice for mining outliers on a low-cost cluster of workstations interconnected by a commodity communication network.
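To make the cost argument concrete, the following is a minimal in-memory sketch of nested-loop distance-based outlier detection. It is not the block-oriented NL algorithm of Knorr and Ng, nor ENL or PENL; it only illustrates why the work grows quadratically with the number of points and linearly with the number of dimensions. It assumes the DB(p, D) definition (a point is an outlier when at least a fraction p of the dataset lies farther than D from it), Euclidean distance, and the convention that a point is not counted as its own neighbour. The function name `db_outliers` and its parameters are hypothetical.

```python
import math


def db_outliers(points, p, d):
    """Return indices of DB(p, d)-outliers using brute-force nested loops.

    Assumption: a point is an outlier if at most n * (1 - p) other points
    lie within distance d of it (the point itself is not counted).
    """
    n = len(points)
    max_neighbours = n * (1 - p)  # largest neighbour count an outlier may have
    outliers = []
    for i, a in enumerate(points):       # outer loop: n iterations
        count = 0
        is_outlier = True
        for j, b in enumerate(points):   # inner loop: up to n iterations
            if i == j:
                continue
            # Each distance computation costs time linear in the dimensionality.
            if math.dist(a, b) <= d:
                count += 1
                if count > max_neighbours:
                    # Too many close neighbours: stop scanning early.
                    is_outlier = False
                    break
        if is_outlier:
            outliers.append(i)
    return outliers


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    print(db_outliers(data, p=0.7, d=2.0))
```

In the worst case every pair of points is compared, which gives the quadratic behaviour noted in the abstract. Because the outer-loop iterations are independent, one plausible way to parallelise such a scan is to split them across processors; whether PENL partitions the work this way is not stated here.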