Detecting outlying subspaces for high-dimensional data: the new task, algorithms, and performance

Authors:
Ji Zhang; Hai Wang
Affiliations:
Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada; Sobey School of Business, Saint Mary's University, Halifax, Nova Scotia, Canada
Venue:
Knowledge and Information Systems
Year:
2006

Citing: 0
Cited by: 11

Detecting outlying properties of exceptional objects
ACM Transactions on Database Systems (TODS)

Anomaly detection: A survey
ACM Computing Surveys (CSUR)

RE2-CD: Robust and Energy Efficient Cut Detection in Wireless Sensor Networks
WASA '09 Proceedings of the 4th International Conference on Wireless Algorithms, Systems, and Applications

Detecting Projected Outliers in High-Dimensional Data Streams
DEXA '09 Proceedings of the 20th International Conference on Database and Expert Systems Applications

Mining Outliers in Correlated Subspaces for High Dimensional Data Sets
Fundamenta Informaticae - Intelligent Data Analysis in Granular Computing

Fuzzy clustering-based approach for outlier detection
ACE'10 Proceedings of the 9th WSEAS international conference on Applications of computer engineering

New outlier detection method based on fuzzy clustering
WSEAS Transactions on Information Science and Applications

Finding key attribute subset in dataset for outlier detection
Knowledge-Based Systems

Towards robustness and energy efficiency of cut detection in wireless sensor networks
Ad Hoc Networks

Mining special features to improve the performance of e-commerce product selection and resume processing
International Journal of Computational Science and Engineering

Review: A review of novelty detection
Signal Processing

Abstract

In this paper, we identify a new task in the study of the outlying degree (OD) of high-dimensional data: finding the subspaces (subsets of features) in which given points are outliers, called their outlying subspaces. Because state-of-the-art outlier detection techniques cannot handle this problem, we propose a novel algorithm, High-Dimension Outlying subspace Detection (HighDOD), to detect the outlying subspaces of high-dimensional data efficiently. The intuition behind HighDOD is to measure the OD of a point as the sum of the distances between the point and its k nearest neighbors. Two heuristic pruning strategies are proposed to speed up the subspace search, and an efficient dynamic subspace search method with a sample-based learning process has been implemented. Experimental results show that HighDOD is efficient and outperforms alternative search methods such as naive top-down, bottom-up, and random search, and that existing outlier detection methods cannot fulfill this new task effectively.
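
The OD measure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' HighDOD implementation: the function names, the use of Euclidean distance, the fixed OD threshold, and the exhaustive enumeration of subspaces are all assumptions made for illustration. The exhaustive loop corresponds roughly to a naive search baseline and includes neither the pruning strategies nor the sample-based learning process that the abstract mentions.

```python
from itertools import combinations
import math


def outlying_degree(point, data, subspace, k):
    """Sum of distances from `point` to its k nearest neighbors in `data`,
    measured only over the feature indices in `subspace`.

    Assumptions: Euclidean distance, and `data` does not contain `point`.
    """
    dists = sorted(
        math.sqrt(sum((point[d] - other[d]) ** 2 for d in subspace))
        for other in data
    )
    return sum(dists[:k])


def outlying_subspaces(point, data, k, threshold):
    """Naive exhaustive search (hypothetical baseline, no pruning):
    return every subspace in which the OD of `point` reaches `threshold`."""
    n_features = len(point)
    found = []
    for size in range(1, n_features + 1):
        for subspace in combinations(range(n_features), size):
            if outlying_degree(point, data, subspace, k) >= threshold:
                found.append(subspace)
    return found


# Toy data: the query point looks ordinary on feature 0 but extreme on feature 1.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
point = (0.5, 10.0)
print(outlying_subspaces(point, data, k=2, threshold=5.0))
```

The exhaustive loop evaluates 2^d - 1 subspaces for d features, which is why a practical method for high-dimensional data needs some form of pruning or guided search.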