Outlier detection based on rough sets theory

  • Authors:
  • Faizah Shaari;Azuraliza Abu Bakar;Abdul Razak Hamdan

  • Affiliations:
  • Center for Artificial Intelligence Technology (CAIT), Faculty of Technology and Information Science, National University of Malaysia, 43600 Bangi, Selangor, Malaysia;(Correspd. E-mail: aab@ftsm.ukm.my) Center for Artificial Intelligence Technology (CAIT), Faculty of Technology and Information Science, National University of Malaysia, 43600 Bangi, Selangor, Mal ...;Center for Artificial Intelligence Technology (CAIT), Faculty of Technology and Information Science, National University of Malaysia, 43600 Bangi, Selangor, Malaysia

  • Venue:
  • Intelligent Data Analysis
  • Year:
  • 2009

Abstract

An outlier in a dataset is a point or a class of points that is considerably dissimilar to or inconsistent with the remainder of the data. Detecting outliers is important for many applications and has long attracted attention in the data mining research community. In this paper, a new method for detecting outliers based on Rough Sets Theory is proposed. The main idea in using Rough Sets for outlier detection is to discover the Non-Reduct of the information system (IS). A Non-Reduct is a set of attributes from the IS that may contain outliers. It is computed by defining the Indiscernibility matrix modulo (iDMM D) and the Indiscernibility function modulo (iDFM D). A measure called RSetOF (Rough Set Outlier Factor Value) is defined to identify outlier objects. Extensive experiments were conducted in which ten benchmark datasets were tested with the proposed method. To evaluate its effectiveness, the proposed method, RSetAlg, is compared with the Frequent Pattern (FindFPOF) method. The experimental results indicate that the proposed approach is a good outlier detection method compared with FindFPOF. Thus, the proposed method is a novel and competitive method for outlier detection.
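
The abstract does not give the definitions of iDMM D, iDFM D, or RSetOF, so the following is only a minimal Python sketch of the standard rough-set building blocks the method appears to rest on: indiscernibility classes and a decision-relative discernibility matrix. The toy table, the attribute names, and both function names are hypothetical and chosen for illustration. The sketch does not compute a Non-Reduct or an RSetOF score, and the paper's own construction may differ.

```python
from itertools import combinations

# Hypothetical toy information system: conditional attributes a, b, c
# and a decision attribute d. None of this data comes from the paper.
ATTRS = ("a", "b", "c")
TABLE = [
    {"a": 1, "b": 0, "c": 1, "d": "yes"},
    {"a": 1, "b": 0, "c": 0, "d": "yes"},
    {"a": 0, "b": 1, "c": 1, "d": "no"},
    {"a": 1, "b": 0, "c": 1, "d": "no"},
]


def indiscernibility_classes(table, attrs):
    """Group object indices that agree on every attribute in attrs."""
    classes = {}
    for i, row in enumerate(table):
        key = tuple(row[a] for a in attrs)
        classes.setdefault(key, []).append(i)
    return list(classes.values())


def discernibility_matrix_modulo_d(table, attrs, decision="d"):
    """For each pair of objects with different decisions, collect the
    conditional attributes on which the two objects differ."""
    matrix = {}
    for i, j in combinations(range(len(table)), 2):
        if table[i][decision] != table[j][decision]:
            matrix[(i, j)] = {a for a in attrs if table[i][a] != table[j][a]}
    return matrix


if __name__ == "__main__":
    print(indiscernibility_classes(TABLE, ATTRS))
    for pair, attrs in sorted(discernibility_matrix_modulo_d(TABLE, ATTRS).items()):
        print(pair, sorted(attrs))
```

In this toy table, objects 0 and 3 fall into the same indiscernibility class but carry different decisions, so their matrix entry is empty. Whether and how such inconsistent objects contribute to the paper's outlier score is not stated in the abstract.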