Parallel ILP for distributed-memory architectures

  • Authors:
  • Nuno A. Fonseca;Ashwin Srinivasan;Fernando Silva;Rui Camacho

  • Affiliations:
Instituto de Biologia Molecular e Celular (IBMC) & CRACS, Universidade do Porto, Porto, Portugal 4169-007;IBM India Research Laboratory, Block 1, Indian Institute of Technology, New Delhi, India 110 016 and Department of CSE & Centre for Health Informatics, University of New South Wales, Sydney, Australia ...;CRACS & Faculdade de Ciências, Universidade do Porto, Porto, Portugal 4169-007;LIAAD & Faculdade de Engenharia, Universidade do Porto, Porto, Portugal 4200-465

  • Venue:
  • Machine Learning
  • Year:
  • 2009

Abstract

The growth of machine-generated relational databases, both in the sciences and in industry, is rapidly outpacing our ability to extract useful information from them by manual means. This has brought into focus machine learning techniques such as Inductive Logic Programming (ILP), which can extract human-comprehensible models from complex relational data. The price to pay is that ILP techniques are not efficient: they can be seen as performing a form of discrete optimisation, which is known to be computationally hard, and their complexity is usually some super-linear function of the number of examples. While little can be done to alter the theoretical bounds on the worst-case complexity of ILP systems, some practical gains may follow from the use of multiple processors. In this paper we survey the state of the art in parallel ILP. We implement several parallel algorithms and study their performance on standard benchmarks. The principal findings are these: (1) of the techniques investigated, the one that simply constructs models in parallel on each processor, each from a subset of the data, and then combines them into a single model yields the best results; and (2) sequential (approximate) ILP algorithms based on randomized searches have lower execution times than (exact) parallel algorithms, without sacrificing the quality of the solutions found.
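Finding (1) describes a data-parallel, learn-and-combine scheme: partition the examples, run an independent learner on each partition, and merge the local models. The sketch below is a minimal, hypothetical illustration of that idea in Python, with a process pool standing in for the message passing a true distributed-memory implementation (e.g. over MPI) would use. All names here (learn_rules, rule_accuracy, combine) and the single-condition rule representation are assumptions made for illustration, not the paper's actual algorithms or API.

```python
# Hypothetical sketch of the "learn locally, combine globally" scheme.
from multiprocessing import Pool

def rule_covers(rule, example):
    # A rule is a frozenset of (attribute, value) conditions; it covers an
    # example when every condition holds on the example's attribute dict.
    return all(example.get(attr) == val for attr, val in rule)

def rule_accuracy(rule, examples):
    # Fraction of positives among the examples the rule covers.
    covered = [label for feats, label in examples if rule_covers(rule, feats)]
    return sum(covered) / len(covered) if covered else 0.0

def learn_rules(examples):
    # Stand-in for one sequential ILP run on a subset of the data: propose
    # single-condition rules and keep those accurate on the local subset.
    candidates = {frozenset([(a, v)])
                  for feats, _ in examples for a, v in feats.items()}
    return [r for r in candidates if rule_accuracy(r, examples) >= 0.8]

def combine(local_models, examples, min_accuracy=0.8):
    # Merge step: union of the local models, filtered so that only rules
    # that also hold on the full example set survive.
    merged = set().union(*local_models) if local_models else set()
    return [r for r in merged if rule_accuracy(r, examples) >= min_accuracy]

def parallel_ilp(examples, n_workers=4):
    chunks = [examples[i::n_workers] for i in range(n_workers)]  # partition data
    with Pool(n_workers) as pool:
        local_models = pool.map(learn_rules, chunks)  # learn in parallel
    return combine(local_models, examples)  # merge into a single model

if __name__ == "__main__":
    # Toy data: attribute dicts with a 0/1 label.
    data = [({"shape": "square", "colour": "red"}, 1),
            ({"shape": "square", "colour": "blue"}, 1),
            ({"shape": "circle", "colour": "red"}, 0),
            ({"shape": "circle", "colour": "blue"}, 0)] * 10
    print(parallel_ilp(data, n_workers=2))
```

The global filtering in combine is one simple way to reconcile local models; it keeps the merge cheap while guarding against rules that only look good on one worker's subset.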