Attribute-value representations, standard in today's data mining systems, have limited expressiveness. Inductive Logic Programming (ILP) provides an interesting alternative, particularly for learning from structured examples whose parts, each with its own attributes, are related to each other by first-order predicates. Several subsets of first-order logic (FOL) with different expressive power have been proposed in ILP. The challenge is that the more expressive the subset of FOL the learner works with, the more critical the dimensionality of the learning task becomes. The Datalog language is expressive enough to represent realistic learning problems when the data is given directly in a relational database, making it a suitable tool for data mining. It is therefore important to devise techniques that dynamically decrease the dimensionality of learning tasks expressed in Datalog, just as Feature Subset Selection (FSS) techniques do in attribute-value learning. Re-using these techniques in ILP immediately runs into a problem: ILP examples have variable size and do not share the same set of literals. We propose here the first paradigm that brings Feature Subset Selection to the level of ILP, in languages at least as expressive as Datalog. The main idea is to first perform a change of representation that approximates the original relational problem by a multi-instance problem. The resulting representation is suitable for FSS techniques, which we adapted from attribute-value learning by taking into account characteristics of the data induced by the change of representation. We present the simple FSS algorithm proposed for the task, the requisite change of representation, and the complete method combining the two. The method acts as a filter that preprocesses the relational data prior to model building and outputs relational examples containing only empirically relevant literals.
We discuss experiments in which the method was successfully applied to two real-world domains.
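To make the two-step idea concrete, here is a minimal sketch, not the authors' algorithm: each relational example (a structured object with several parts) is mapped to a "bag" of fixed-length instance vectors, and a crude filter-style FSS then scores each feature by how well its bag-level maximum separates positive from negative bags. The representation, feature names, and scoring rule are all illustrative assumptions.

```python
def relational_to_bags(examples, features):
    """Change of representation: map each relational example (a list of
    part-dicts) to a bag of fixed-length vectors, one vector per part,
    over a shared feature list. Missing attributes default to 0.0."""
    return [[[part.get(f, 0.0) for f in features] for part in ex]
            for ex in examples]

def mi_filter_scores(bags, labels):
    """Filter-style FSS over bags: score feature j by the gap between
    the mean bag-level maximum of j on positive bags and on negative
    bags (a crude multi-instance relevance measure, assumed here)."""
    n_features = len(bags[0][0])
    scores = []
    for j in range(n_features):
        pos = [max(inst[j] for inst in bag)
               for bag, y in zip(bags, labels) if y == 1]
        neg = [max(inst[j] for inst in bag)
               for bag, y in zip(bags, labels) if y == 0]
        scores.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
    return scores

def select_features(bags, labels, k):
    """Return the indices of the k highest-scoring features."""
    scores = mi_filter_scores(bags, labels)
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]

# Illustrative usage with two toy examples (feature names are made up):
features = ["size", "charge"]
examples = [
    [{"size": 0.9, "charge": 0.1}, {"size": 0.2}],  # positive example
    [{"size": 0.1, "charge": 0.2}],                 # negative example
]
bags = relational_to_bags(examples, features)
print(select_features(bags, [1, 0], 1))  # keeps the more discriminative feature
```

In the actual method the filtering happens at the level of literals and the output is re-expressed as relational examples; the sketch only shows the propositional core of the pipeline: change of representation first, then a multi-instance-aware filter before model building.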