AD-Miner: A new incremental method for discovery of minimal approximate dependencies using logical operations

  • Authors:
  • S. M. Fakhrahmad; M. H. Sadreddini; M. Zolghadri Jahromi

  • Affiliations:
  • Department of Computer Science & Engineering, School of Engineering, Shiraz University, Shiraz, Iran; Department of Computer Science & Engineering, School of Engineering, Shiraz University, Shiraz, Iran; Department of Computer Science & Engineering, School of Engineering, Shiraz University, Shiraz, Iran

  • Venue:
  • Intelligent Data Analysis
  • Year:
  • 2008

Abstract

Discovering relations between attribute values in a relational database (i.e., functional dependencies) is an important problem in data mining and knowledge discovery. Many search techniques have been proposed to discover classical and extended functional dependencies, but even the most efficient solutions do not perform acceptably on large relation instances. In addition, most of the proposed algorithms assume that the database is static, so every database update requires re-scanning the entire data. In this paper, we propose a new incremental method, AD-Miner, to discover Approximate Dependencies (ADs). The main part of our work is based on logical operations, which aim to reduce the computational complexity. Because the method is incremental, it avoids re-scanning the database when a set of tuples is added to the relation. Our experimental results indicate that our method is more efficient than FastFDs [22], which is one of the most efficient algorithms for mining perfect dependencies. Furthermore, we show that the complexity of our method is lower than that of the major incremental methods, namely the partitioning and pairwise comparison methods. In addition, our method marks the indices of the tuples that violate a dependency, a feature that can be used to find exceptional cases that are inconsistent with the rest of the data. We have implemented AD-Miner and tested it on several benchmark and synthetic data sets.
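
To make the notion of an approximate dependency and of "marking" violating tuples concrete, the following is a minimal Python sketch. It is not the AD-Miner algorithm and does not use the logical operations or the incremental update described in the paper; it is a naive single-pass check under assumptions of our own. The function name `check_approximate_dependency`, the `max_error` threshold, and the toy relation are all hypothetical, and the error measure used (the smallest fraction of tuples that must be removed for X → Y to hold exactly, often called g3 in the literature) is an assumption rather than something the abstract specifies.

```python
from collections import Counter, defaultdict


def check_approximate_dependency(rows, lhs, rhs, max_error=0.1):
    """Naively test whether lhs -> rhs holds approximately in `rows`.

    rows      : list of dicts, one per tuple (hypothetical representation)
    lhs       : list of attribute names on the left-hand side
    rhs       : single attribute name on the right-hand side
    max_error : assumed threshold on the fraction of violating tuples

    Returns (holds, error, violators), where `violators` lists the indices
    of the tuples that would have to be removed for the dependency to hold.
    """
    # Group tuple indices by their left-hand-side values.
    groups = defaultdict(list)
    for idx, row in enumerate(rows):
        key = tuple(row[a] for a in lhs)
        groups[key].append(idx)

    violators = []
    for indices in groups.values():
        # Within a group, keep the most frequent rhs value and
        # mark every tuple that disagrees with it.
        counts = Counter(rows[i][rhs] for i in indices)
        majority_value, _ = counts.most_common(1)[0]
        violators.extend(i for i in indices if rows[i][rhs] != majority_value)

    violators.sort()
    error = len(violators) / len(rows) if rows else 0.0
    return error <= max_error, error, violators


# Toy relation (made-up values): A -> B holds except for the tuple at index 2.
rows = [
    {"A": 1, "B": "x"},
    {"A": 1, "B": "x"},
    {"A": 1, "B": "y"},
    {"A": 2, "B": "z"},
    {"A": 2, "B": "z"},
]
holds, error, violators = check_approximate_dependency(
    rows, ["A"], "B", max_error=0.25
)
print(holds, error, violators)
```

In this toy relation, one of the five tuples disagrees with the majority value of its group, so the dependency A → B is accepted at a threshold of 0.25 and the tuple at index 2 is reported as the exceptional case. A full re-scan like this on every update is exactly the cost that an incremental method is meant to avoid.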