Finding key attribute subset in dataset for outlier detection

Authors:
Peng Yang;Qingsheng Zhu
Affiliations:
College of Computer Science, Chongqing University, Chongqing 400044, China;College of Computer Science, Chongqing University, Chongqing 400044, China
Venue:
Knowledge-Based Systems
Year:
2011

Abstract

Outlier detection in high-dimensional datasets has found important applications in many fields, yet its excessive time consumption is likely to hinder practical use. It therefore makes sense to build an efficient method for finding meaningful outliers and analyzing their intensional knowledge. In this paper, we use the concept of rough sets to construct a method for outlying reduction, based on an outlier detection and analysis system. By defining outlying partition similarity, we can mine outliers on a key attribute subset rather than on the full attribute set of the dataset, as long as the outlying partitions produced on the two are sufficiently similar. To this end, we propose a novel method for finding the key attribute subset of a dataset: it first finds all outliers on the full attribute set, then searches through all outlying attribute subsets for these points, and finally determines the key attribute subset according to the similarity between outlying partitions. Experiments show that our method finds the key attribute subset more efficiently than previous methods, thereby improving the feasibility of outlier detection.
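The workflow described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not define the outlier detector, the outlying partition similarity, or the search strategy, so everything below is an assumption made for illustration. The sketch uses a k-th-nearest-neighbour distance score as the detector, Jaccard similarity between outlier sets as a stand-in for outlying partition similarity, and a brute-force search over attribute subsets from smallest to largest. The function names (`knn_outliers`, `partition_similarity`, `find_key_attribute_subset`) and the toy data are hypothetical.

```python
from itertools import combinations
import math


def knn_outliers(points, attrs, k=2, n_outliers=2):
    """Indices of the n_outliers points with the largest k-th nearest
    neighbour distance, measured only on the attributes in attrs.
    (Assumed detector; the paper's own detector is not given here.)"""
    def dist(a, b):
        return math.sqrt(sum((a[j] - b[j]) ** 2 for j in attrs))

    scores = []
    for i, p in enumerate(points):
        d = sorted(dist(p, q) for m, q in enumerate(points) if m != i)
        scores.append((d[k - 1], i))
    top = sorted(scores, reverse=True)[:n_outliers]
    return frozenset(i for _, i in top)


def partition_similarity(a, b):
    """Jaccard similarity between two outlier sets (assumed stand-in
    for the paper's outlying partition similarity)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def find_key_attribute_subset(points, k=2, n_outliers=2, threshold=1.0):
    """Return the smallest attribute subset whose outliers are similar
    enough to those found on the full attribute set."""
    n_attrs = len(points[0])
    full = tuple(range(n_attrs))
    # Step 1: outliers on the full attribute set serve as the reference.
    reference = knn_outliers(points, full, k, n_outliers)
    # Step 2: try proper subsets, smallest first (exhaustive, exponential).
    for size in range(1, n_attrs):
        for subset in combinations(full, size):
            found = knn_outliers(points, subset, k, n_outliers)
            # Step 3: accept the first subset that is similar enough.
            if partition_similarity(reference, found) >= threshold:
                return subset, found
    return full, reference


# Toy data: only attribute 0 separates the last two points from the rest.
points = [
    (1.0, 5.0, 0.2),
    (1.1, 5.1, 0.1),
    (0.9, 4.9, 0.3),
    (1.2, 5.0, 0.2),
    (1.0, 5.2, 0.1),
    (9.0, 5.1, 0.2),
    (-7.0, 4.9, 0.3),
]
subset, outliers = find_key_attribute_subset(points)
print(subset, sorted(outliers))  # prints (0,) [5, 6]
```

On this toy data, attribute 0 alone reproduces the outliers found on all three attributes, so later detection runs could use that single attribute. The exhaustive search is exponential in the number of attributes, which is the kind of cost an efficient key-subset search has to avoid.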