The Local Outlier Factor (LOF) is a powerful anomaly detection method used in machine learning and classification. The algorithm defines a local outlier as an object whose degree of outlyingness depends on the density of its local neighborhood; each object is assigned an LOF value that represents the likelihood of that object being an outlier. Although the concept of a local outlier is useful, computing LOF values for every data object requires a large number of k-nearest neighbor queries, and this computational overhead can limit the use of LOF. Graphics Processing Units (GPUs) are increasingly popular in general-purpose computing and can be programmed in high-level languages designed for general-purpose applications (e.g., CUDA), so we apply this parallel computing approach to accelerate LOF. In this paper we explore how to use a CUDA-based GPU implementation of the k-nearest neighbor algorithm to accelerate LOF classification. We achieve more than a 100X speedup over a multi-threaded dual-core CPU implementation. We also vary the input data set size, the neighborhood size (i.e., the value of k), and the feature space dimension, and report their impact on execution time.
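To make the role of the k-nearest neighbor queries concrete, the following is a minimal CPU-only sketch of LOF in Python, following the standard LOF definitions (k-distance, reachability distance, local reachability density). It is not the paper's CUDA implementation: the function names `knn` and `lof_scores` are hypothetical, the neighbor search is a brute-force O(n²) scan, and ties at the k-th distance are broken by index rather than enlarging the neighborhood. The `knn` step is the part a GPU implementation would presumably parallelize.

```python
import math


def knn(points, k):
    """Brute-force k-nearest neighbors (illustrative helper, not from the paper).

    Returns, for each point, a list of (distance, index) pairs for its
    k closest other points, sorted by increasing distance.
    """
    result = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        result.append(dists[:k])
    return result


def lof_scores(points, k):
    """Compute an LOF value for every point.

    Assumes fewer than k + 1 exact duplicates of any point; otherwise the
    reachability sum can be zero and the density is undefined.
    """
    neighbors = knn(points, k)

    # k-distance: distance from each point to its k-th nearest neighbor.
    k_dist = [nbrs[-1][0] for nbrs in neighbors]

    # Local reachability density: inverse of the mean reachability distance,
    # where reach-dist(p, o) = max(k-distance(o), d(p, o)).
    lrd = []
    for nbrs in neighbors:
        reach_sum = sum(max(k_dist[j], d) for d, j in nbrs)
        lrd.append(len(nbrs) / reach_sum)

    # LOF: mean ratio of the neighbors' densities to the point's own density.
    return [
        sum(lrd[j] for _, j in nbrs) / (len(nbrs) * lrd[i])
        for i, nbrs in enumerate(neighbors)
    ]


if __name__ == "__main__":
    # Four corners of a unit square plus one distant point.
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    for point, score in zip(data, lof_scores(data, k=2)):
        print(point, round(score, 3))
```

Values near 1 indicate a point whose density matches that of its neighbors, while values well above 1 indicate a likely outlier. In this sketch every point issues its own neighbor query, which is why the cost grows quickly with data set size, k, and dimension.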