The amount of data produced by ubiquitous computing applications is growing quickly, owing to the pervasive presence of small devices endowed with sensing, computing and communication capabilities. The heterogeneity and strong interdependence that characterize 'ubiquitous data' require a (multi-)relational approach to their analysis. However, relational data mining algorithms do not scale well, and very large data sets can hardly be processed. In this paper we propose an extension of a relational algorithm for multi-level frequent pattern discovery that resorts to data sampling and distributed computation in Grid environments in order to overcome the computational limits of the original serial algorithm. The set of patterns discovered by the new algorithm approximates the set of exact solutions found by the serial algorithm. The quality of the approximation depends on three parameters: the proportion of data in each sample, the minimum support thresholds, and the number of samples in which a pattern has to be frequent in order to be considered globally frequent. Because the first two parameters are difficult to control, we focus our investigation on the third. The theoretically derived conclusions are confirmed experimentally. Moreover, an additional application to event log mining demonstrates the viability of the proposed approach to relational frequent pattern mining from very large data sets.
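The role of the third parameter can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper: it assumes plain itemsets rather than multi-level relational patterns, uses a brute-force miner, and replaces the Grid nodes with a sequential loop. All names (`frequent_itemsets`, `approximate_frequent`, `n_samples`, `proportion`, `k`) are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force miner (illustrative stand-in for the relational miner):
    return itemsets of up to max_size items whose relative support
    in `transactions` is at least min_support."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    n = len(transactions)
    return {p for p, c in counts.items() if c / n >= min_support}


def approximate_frequent(transactions, n_samples, proportion,
                         min_support, k, seed=0):
    """Mine each sample independently and keep the patterns that are
    frequent in at least k of the n_samples samples.
    The loop is sequential here; in a distributed setting each
    iteration would run on a separate node (assumption)."""
    rng = random.Random(seed)
    sample_size = max(1, round(proportion * len(transactions)))
    votes = Counter()
    for _ in range(n_samples):
        # Sample without replacement from the full data set.
        sample = rng.sample(transactions, sample_size)
        for pattern in frequent_itemsets(sample, min_support):
            votes[pattern] += 1
    return {p for p, v in votes.items() if v >= k}


if __name__ == "__main__":
    data = [["a", "b"], ["a", "b"], ["a", "c"], ["b"]]
    exact = frequent_itemsets(data, min_support=0.5)
    approx = approximate_frequent(data, n_samples=5, proportion=0.75,
                                  min_support=0.5, k=3)
    print("exact:", sorted(map(sorted, exact)))
    print("approx:", sorted(map(sorted, approx)))
```

In this sketch, raising `k` for a fixed set of samples can only shrink the returned set: a stricter vote admits fewer patterns that are frequent in a sample by chance, at the cost of possibly missing patterns that are frequent in the full data.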