Relational databases are the most popular repository for structured data and are thus one of the richest sources of knowledge in the world. In a relational database, multiple relations are linked together via entity-relationship links. Multirelational classification is the procedure of building a classifier based on information stored in multiple relations and making predictions with it. Existing Inductive Logic Programming approaches (more recently also known as Relational Mining) have proven effective and accurate in multirelational classification. Unfortunately, most of them scale poorly with the number of relations in a database. In this paper, we propose a new approach, called CrossMine, which includes a set of novel and powerful methods for multirelational classification: 1) tuple ID propagation, an efficient and flexible method for virtually joining relations, which enables convenient search among different relations; 2) new definitions for predicates and decision-tree nodes, which involve aggregated information to provide essential statistics for classification; and 3) a selective sampling method for improving scalability with regard to the number of tuples. Based on these techniques, we propose two scalable and accurate methods for multirelational classification: CrossMine-Rule, a rule-based method, and CrossMine-Tree, a decision-tree-based method. Our comprehensive experiments on both real and synthetic data sets demonstrate the high scalability and accuracy of the CrossMine approach.
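The idea of virtually joining relations by propagating tuple IDs can be illustrated with a minimal Python sketch. It is not the implementation from the paper. It assumes one plausible reading of the technique: each tuple of a non-target relation is annotated with the IDs of the target tuples joinable with it, so class counts for a predicate on that relation can be computed without materializing the join. The relations (`loans`, `accounts`), the column names, the data, and the helper functions `propagate_ids` and `class_counts` are all hypothetical.

```python
from collections import defaultdict


def propagate_ids(target_rows, other_rows, target_key, other_key):
    """For each row of other_rows, return the set of target-tuple IDs
    that would join with it on target_key = other_key."""
    ids_by_key = defaultdict(set)
    for row in target_rows:
        ids_by_key[row[target_key]].add(row["id"])
    # .get avoids inserting empty entries for keys with no matching target tuple
    return [ids_by_key.get(row[other_key], set()) for row in other_rows]


def class_counts(other_rows, propagated, labels, predicate):
    """Count positive and negative target tuples covered by a predicate
    that is evaluated on the non-target relation."""
    covered = set()
    for row, ids in zip(other_rows, propagated):
        if predicate(row):
            covered |= ids
    positives = sum(1 for tuple_id in covered if labels[tuple_id])
    return positives, len(covered) - positives


# Hypothetical target relation: each loan carries a class label.
loans = [
    {"id": 1, "account_id": 124, "label": True},
    {"id": 2, "account_id": 124, "label": True},
    {"id": 3, "account_id": 108, "label": False},
    {"id": 4, "account_id": 45, "label": False},
    {"id": 5, "account_id": 45, "label": True},
]

# Hypothetical non-target relation linked to loans through account_id.
accounts = [
    {"account_id": 124, "frequency": "monthly"},
    {"account_id": 108, "frequency": "weekly"},
    {"account_id": 45, "frequency": "monthly"},
    {"account_id": 67, "frequency": "weekly"},
]

labels = {row["id"]: row["label"] for row in loans}
propagated = propagate_ids(loans, accounts, "account_id", "account_id")
counts = class_counts(
    accounts, propagated, labels, lambda row: row["frequency"] == "monthly"
)
print(propagated)
print(counts)
```

In this sketch the ID sets are built in a single pass over each relation, and a candidate predicate such as `frequency == "monthly"` is scored by looking only at the propagated IDs and the target labels.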