Finding key attribute subset in dataset for outlier detection

  • Authors:
  • Peng Yang;Qingsheng Zhu

  • Affiliations:
  • College of Computer Science, Chongqing University, Chongqing 400044, China;College of Computer Science, Chongqing University, Chongqing 400044, China

  • Venue:
  • Knowledge-Based Systems
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Detection of outlier from high dimensional dataset have found important applications in many fields, yet the unexpected time consumption is likely to hinder its practical use. Thus, it makes sense to build an efficient method for finding meaningful outliers and analyzing their intentional knowledge. In this paper, we utilize the concept of rough set to construct a method for outlying reduction, based on an outlier detection and analysis system. By defining outlying partition similarity, we can mine outliers on the key attribute subset rather than on the full dimensional attribute set of dataset, as long as the similarity between outlying partitions produced on them is large enough. For this purpose, we propose a novel method for finding the key attribute subset in dataset, which starts by seeking all outliers on the full attribute set, and then searches through all outlying attribute subsets for these points. After that, it turns out to be able to determine the key attribute subset in accordance with the similarity between outlying partitions. By experiments, we show that our method allows more efficient seeking of key attribute subset than the previous methods, thereby improving the feasibility of outlier detection.