Some issues about outlier detection in rough set theory

  • Authors:
  • Feng Jiang; Yuefei Sui; Cungen Cao

  • Affiliations:
  • College of Information and Science Technology, Qingdao University of Science and Technology, Qingdao 266061, PR China; Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, PR China; Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, PR China

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2009

Quantified Score

Hi-index 12.06

Abstract

"One person's noise is another person's signal" (Knorr, E., Ng, R. (1998). Algorithms for mining distance-based outliers in large datasets. In Proceedings of the 24th VLDB conference, New York (pp. 392-403)). In recent years, much attention has been given to the problem of outlier detection, whose aim is to detect outliers: objects which behave in an unexpected way or have abnormal properties. Detecting such outliers is important for many applications, such as detecting criminal activities in electronic commerce, computer intrusion attacks, terrorist threats, and agricultural pest infestations, which makes outlier detection critically important in an information-based society. In this paper, we discuss some issues about outlier detection in rough set theory, which emerged about 20 years ago and is nowadays a rapidly developing branch of artificial intelligence and soft computing. First, we propose a novel definition of outliers in information systems of rough set theory: sequence-based outliers. An algorithm to find such outliers in rough set theory is also given. The effectiveness of the sequence-based method for outlier detection is demonstrated on two publicly available databases. Second, we introduce traditional distance-based outlier detection to rough set theory and discuss the definitions of distance metrics for distance-based outlier detection in rough set theory.
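
To make the distance-based notion concrete, the following minimal sketch applies the Knorr and Ng style DB(p, D) criterion (an object is an outlier if at least a fraction p of all objects lie farther than distance D from it) to a small table of nominal attribute values. It is an illustration under stated assumptions, not the algorithm of the paper: the overlap metric (the number of attributes on which two objects differ) is assumed here as one possible distance for nominal data, and the names `overlap_distance`, `distance_based_outliers`, and `TABLE`, as well as the toy data, are hypothetical.

```python
from typing import Hashable, List, Sequence

Row = Sequence[Hashable]


def overlap_distance(x: Row, y: Row) -> int:
    """Count the attributes on which two objects take different values.

    Assumption: the overlap metric stands in for whichever distance
    metric is chosen for nominal attributes.
    """
    if len(x) != len(y):
        raise ValueError("objects must have the same number of attributes")
    return sum(1 for a, b in zip(x, y) if a != b)


def distance_based_outliers(table: List[Row], p: float, d: float) -> List[int]:
    """Return the indices of DB(p, d)-outliers in the table.

    An object is reported if at least a fraction p of all objects in the
    table lie at a distance greater than d from it.
    """
    n = len(table)
    outliers = []
    for i, x in enumerate(table):
        # Number of objects farther than d from x (x itself is at distance 0).
        far = sum(1 for y in table if overlap_distance(x, y) > d)
        if far >= p * n:
            outliers.append(i)
    return outliers


# Hypothetical toy information system: each row is one object,
# each column one nominal attribute.
TABLE = [
    ("sunny", "hot", "high"),
    ("sunny", "hot", "high"),
    ("sunny", "mild", "high"),
    ("sunny", "hot", "normal"),
    ("rainy", "cool", "low"),
]

if __name__ == "__main__":
    print(distance_based_outliers(TABLE, p=0.75, d=2))
```

In this toy table the last object differs from every other object on all three attributes, so it is the only one for which at least 75% of the objects lie farther than distance 2. The pairwise scan is quadratic in the number of objects, which is acceptable for a sketch but not for large datasets.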