DRILA: a distributed relational inductive learning algorithm

  • Authors:
  • Saleh M. Abu-Soud; Ali Al-Ibrahim

  • Affiliations:
  • Computer Science Department, New York Institute of Technology, Amman, Jordan; Faculty of Information Systems, The Arab Academy for Banking and Financial Sciences, Amman, Jordan

  • Venue:
  • WSEAS Transactions on Computers
  • Year:
  • 2009

Abstract

This paper describes a new rule discovery algorithm called the Distributed Relational Inductive Learning Algorithm (DRILA). It was developed as part of ongoing research on the Inductive Learning Algorithm (ILA) [11] and its extension ILA2 [12], which were built to learn from a single table, and on the Relational Inductive Learning Algorithm (RILA) [13], [14], which was developed to learn from a group of interrelated tables, i.e. a centralized database. DRILA discovers distributed relational rules using data from distributed relational databases. Such a database consists of a collection of sites, each of which maintains a local database system, or of a collection of multiple, logically interrelated databases distributed over a computer network. The basic assumption of the algorithm is that the objects to be analyzed are stored in a set of tables that are distributed over many locations. The distributed relational rules discovered can be used either to predict an unknown object attribute value or to extract hidden relationships between the objects' attribute values. The rule discovery algorithm was designed to use data available from many locations (sites) and to work with any possible 'connected' schema at each location, that is, one in which the tables concerned are connected by foreign keys. To achieve reasonable performance, the 'hypotheses search' algorithm constructs new hypotheses by refining previously constructed ones, thereby avoiding recomputation. Unlike many other relational learning algorithms, DRILA does not need its own copy of the distributed relational data in order to process it. This is important for the scalability and usability of the distributed relational data mining solution that has been developed. The proposed architecture can be used as a framework to upgrade other propositional learning algorithms to relational learning.
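
The idea of building new hypotheses by refining earlier ones can be illustrated with a small sketch. This is not the DRILA implementation: the `Hypothesis` class, the `refine` and `is_consistent` helpers, and the toy `ROWS` table are all hypothetical names and data introduced here for illustration. The sketch works on a single in-memory table and leaves out distribution and foreign-key joins entirely. It shows only one plausible reading of "refinement": a child hypothesis keeps the parent's set of covered rows and filters that set, rather than rescanning the whole table.

```python
from dataclasses import dataclass

# Toy rows standing in for one table at one site (hypothetical data).
ROWS = [
    {"size": "small", "color": "red", "cls": "yes"},
    {"size": "small", "color": "blue", "cls": "no"},
    {"size": "large", "color": "red", "cls": "yes"},
    {"size": "large", "color": "blue", "cls": "yes"},
]


@dataclass(frozen=True)
class Hypothesis:
    """A conjunction of attribute=value tests plus the rows it covers."""
    conditions: tuple   # ((attribute, value), ...)
    covered: frozenset  # indices of rows that satisfy every condition


def root(rows):
    """The empty hypothesis covers every row."""
    return Hypothesis((), frozenset(range(len(rows))))


def refine(h, attribute, value, rows):
    """Add one condition; only the parent's covered rows are re-checked."""
    covered = frozenset(i for i in h.covered if rows[i][attribute] == value)
    return Hypothesis(h.conditions + ((attribute, value),), covered)


def is_consistent(h, rows, target):
    """True if the hypothesis covers at least one row and all share one class."""
    classes = {rows[i][target] for i in h.covered}
    return len(h.covered) > 0 and len(classes) == 1


h0 = root(ROWS)
h1 = refine(h0, "size", "small", ROWS)   # covers rows 0 and 1, mixed classes
h2 = refine(h1, "color", "red", ROWS)    # filters h1's rows only
h_large = refine(h0, "size", "large", ROWS)

for h in (h1, h2, h_large):
    print(h.conditions, sorted(h.covered), is_consistent(h, ROWS, "cls"))
```

In this sketch `h2` is derived from `h1` by testing two rows instead of four. On larger tables, or on tables held at remote sites, reusing the parent's coverage in this way is one means of avoiding recomputation.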