Comparison of graph-based and logic-based multi-relational data mining

Authors:
Nikhil S. Ketkar;Lawrence B. Holder;Diane J. Cook
Affiliations:
University of Texas at Arlington;University of Texas at Arlington;University of Texas at Arlington
Venue:
ACM SIGKDD Explorations Newsletter
Year:
2005

Abstract

We present an experimental comparison of the graph-based multi-relational data mining system Subdue and the inductive logic programming system CProgol on the Mutagenesis dataset and on various artificially generated Bongard problems. The results indicate that Subdue can significantly outperform CProgol when discovering structurally large multi-relational concepts. We also observe that CProgol is better at learning semantically complicated concepts and tends to use background knowledge more effectively than Subdue. Our analysis indicates that these performance differences result from the difference in expressiveness between the logic-based and graph-based representations. The ability of graph-based systems to learn structurally large concepts comes from their use of a weaker representation whose expressiveness is intermediate between propositional and first-order logic. This weaker representation is advantageous when learning structurally large concepts, but it limits the learning of semantically complicated concepts and the utilization of background knowledge.
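To make the contrast between the two representations concrete, here is a minimal sketch of how one small Bongard-style scene might be written as ground logic facts and then as a labeled graph. The scene, the names `Fact` and `facts_to_graph`, and the encoding itself are assumptions made for illustration; they are not the actual input formats of Subdue or CProgol.

```python
from typing import Dict, List, NamedTuple, Tuple


class Fact(NamedTuple):
    """A ground logic fact such as inside(o1, o2). Hypothetical helper type."""
    predicate: str
    args: Tuple[str, ...]


# Hypothetical scene: a triangle (o1) inside a circle (o2).
facts = [
    Fact("triangle", ("o1",)),
    Fact("circle", ("o2",)),
    Fact("inside", ("o1", "o2")),
]


def facts_to_graph(
    facts: List[Fact],
) -> Tuple[Dict[str, str], List[Tuple[str, str, str]]]:
    """Encode ground facts as a labeled directed graph.

    Assumed encoding: a unary fact labels a vertex, and a binary fact
    becomes a labeled edge (source, label, target). Other arities are
    rejected because this simple encoding has no place for them.
    """
    vertices: Dict[str, str] = {}
    edges: List[Tuple[str, str, str]] = []
    for fact in facts:
        if len(fact.args) == 1:
            vertices[fact.args[0]] = fact.predicate
        elif len(fact.args) == 2:
            source, target = fact.args
            edges.append((source, fact.predicate, target))
        else:
            raise ValueError(f"unsupported arity in {fact!r}")
    return vertices, edges


vertices, edges = facts_to_graph(facts)
print(vertices)
print(edges)
```

Ground facts carry over to vertices and edges directly, but a rule with variables (for example, a hypothetical background definition of a derived relation) has no direct counterpart in this simple encoding. That is one way to picture the limitation the abstract attributes to the weaker graph representation.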