Inductive databases (IDBs) contain both data and patterns. Inductive queries (IQs) are used to access, generate, and manipulate the patterns in an IDB. An IQ is a conjunction of primitive constraints that the target patterns must satisfy; the available constraints can differ from one type of pattern to another. Constraint-based data mining algorithms are used to answer IQs. So far, work on IDBs has mostly addressed the mining of frequent patterns; the pattern types considered include frequent itemsets, episodes, Datalog queries, sequences, and molecular fragments. Here we consider constraint-based mining of predictive models, where the data mining task is regression and the models are polynomial equations. More specifically, we first define the pattern domain of polynomial equations. We then present a complete solver and a heuristic solver for this domain. We evaluate the heuristic solver on standard regression problems and illustrate its use on a toy problem of reconstructing a biochemical reaction network. Finally, we consider combining different pattern domains (molecular fragments and polynomial equations) for practical applications in modeling quantitative structure-activity relationships (QSARs).
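To make the idea of an inductive query over polynomial equations concrete, the following is a minimal sketch of how such a query could be answered by exhaustive enumeration. It is not the complete or heuristic solver presented in this work. The primitive constraints used here (`max_degree`, `max_terms`, `max_rmse`), all function names, and the toy data are assumptions chosen for illustration. Each candidate structure is a constant plus a set of monomials, its coefficients are fitted by ordinary least squares, and it is returned only if it satisfies every constraint in the conjunction.

```python
from itertools import combinations, product


def monomials(n_vars, max_degree):
    """All exponent tuples with total degree between 1 and max_degree."""
    return [e for e in product(range(max_degree + 1), repeat=n_vars)
            if 1 <= sum(e) <= max_degree]


def evaluate(exps, row):
    """Value of the monomial with exponents `exps` on one data row."""
    value = 1.0
    for x, e in zip(row, exps):
        value *= x ** e
    return value


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return None  # singular system, no unique fit
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(terms, rows, targets):
    """Least-squares fit of constant + sum of terms; returns (coeffs, rmse)."""
    design = [[1.0] + [evaluate(t, row) for t in terms] for row in rows]
    k = len(design[0])
    # Normal equations: (A^T A) c = A^T y
    ata = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    atb = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(k)]
    coeffs = solve(ata, atb)
    if coeffs is None:
        return None, float("inf")
    sse = sum((sum(c * v for c, v in zip(coeffs, d)) - y) ** 2
              for d, y in zip(design, targets))
    return coeffs, (sse / len(rows)) ** 0.5


def answer_query(rows, targets, max_degree, max_terms, max_rmse):
    """Return every polynomial structure satisfying all three constraints."""
    candidates = monomials(len(rows[0]), max_degree)
    answers = []
    for size in range(1, max_terms + 1):
        for terms in combinations(candidates, size):
            coeffs, rmse = fit(terms, rows, targets)
            if rmse <= max_rmse:
                answers.append((terms, coeffs, rmse))
    return answers


# Toy data (hypothetical): y = 2 * x1 * x2 + 3 on a 3 x 3 grid.
rows = [(x1, x2) for x1 in range(1, 4) for x2 in range(1, 4)]
targets = [2 * x1 * x2 + 3 for x1, x2 in rows]
answers = answer_query(rows, targets, max_degree=2, max_terms=1, max_rmse=1e-6)
for terms, coeffs, rmse in answers:
    print(terms, [round(c, 6) for c in coeffs], rmse)
```

Exhaustive enumeration of this kind grows combinatorially with the number of variables and the degree bound, which is one reason a heuristic solver is a natural companion to a complete one.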