Can ILP be applied to large datasets?

Authors:
Hiroaki Watanabe; Stephen Muggleton
Affiliations:
Imperial College London, London, UK; Imperial College London, London, UK
Venue:
ILP'09: Proceedings of the 19th International Conference on Inductive Logic Programming
Year:
2009

Abstract

Large datasets are common in science and business. Existing ILP systems cannot be applied effectively to data sets with 10000 data points. In this paper, we consider a technique that makes ILP applicable to more than 10000 data points by simplifying the data. Our approach, called Approximative Generalisation, compresses several data points into one example. When the original examples are a mixture of positive and negative examples, the resulting example is annotated with a probability value representing the proportion of positive examples. Our longer-term aim is to apply the technique to large Chess endgame databases, which allow well-controlled evaluations. In this paper we start with the simple game of Noughts and Crosses and apply the minimax backup algorithm to obtain a database of examples. These outcomes are compacted using our approach, and empirical results show an advantage in both accuracy and speed. In further work we hope to apply the approach to large databases from both natural and artificial domains.
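
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the abstraction used to merge data points, so the function names (`compress`, `material_features`, `reachable`) and the choice of features below are assumptions made purely for illustration. The sketch labels Noughts and Crosses positions by minimax backup, then merges positions that share the same abstract description into one example annotated with the proportion of positives.

```python
from collections import defaultdict
from functools import lru_cache

# The eight winning lines on a 3x3 board stored as a 9-character string.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def minimax(board, player):
    """Backed-up value for X: +1 win, 0 draw, -1 loss; `player` moves next."""
    w = winner(board)
    if w is not None:
        return 1 if w == "X" else -1
    if "." not in board:
        return 0
    nxt = "O" if player == "X" else "X"
    values = [minimax(board[:i] + player + board[i + 1:], nxt)
              for i, cell in enumerate(board) if cell == "."]
    return max(values) if player == "X" else min(values)


def reachable(board="." * 9, player="X", seen=None):
    """Collect every position reachable from the empty board (X moves first)."""
    seen = {} if seen is None else seen
    if board in seen:
        return seen
    seen[board] = player
    if winner(board) is None and "." in board:
        nxt = "O" if player == "X" else "X"
        for i, cell in enumerate(board):
            if cell == ".":
                reachable(board[:i] + player + board[i + 1:], nxt, seen)
    return seen


def compress(examples, abstraction):
    """Merge examples with the same abstract description into one example
    annotated with the proportion of positives among the merged points."""
    counts = defaultdict(lambda: [0, 0])  # key -> [positives, total]
    for item, is_positive in examples:
        entry = counts[abstraction(item)]
        entry[0] += int(is_positive)
        entry[1] += 1
    return {key: pos / total for key, (pos, total) in counts.items()}


def material_features(position):
    """Hypothetical abstraction: piece counts plus who holds the centre."""
    board, player = position
    return (board.count("X"), board.count("O"), board[4], player)


if __name__ == "__main__":
    positions = reachable()
    # A position is a positive example if X wins under perfect play.
    examples = [((b, p), minimax(b, p) == 1) for b, p in positions.items()]
    compressed = compress(examples, material_features)
    print(len(examples), "positions compressed to", len(compressed), "examples")
```

A coarser abstraction yields fewer, more heavily mixed examples, so the choice of abstraction trades compression against how informative the probability annotations remain.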