Reducing the size of databases for multirelational classification: a subgraph-based approach
Journal of Intelligent Information Systems
Multirelational data mining methods discover patterns across multiple interlinked tables (relations) in a relational database. In many large organizations, such a multirelational database spans numerous departments or subdivisions that are involved in different aspects of the enterprise, such as customer profiling, fraud detection, inventory management, and financial management. In multirelational classification, these subdivisions have different interests in the data, so each needs to explore the subset of relations that has high utility with respect to its own target class. This paper presents a novel approach for pruning the uninteresting relations of a relational database whose relations come from such different parties and span many classification tasks. We aim to create a pruned structure while minimizing the loss in predictive performance of the final classification model. Our method identifies a set of strongly uncorrelated subgraphs to use for training and discards all others. Our experiments show that this strategy significantly reduces the size of the relational schema without sacrificing predictive accuracy.
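
The selection step can be illustrated with a minimal sketch. It is not the paper's algorithm, whose details the abstract does not give. It assumes that each candidate subgraph has already produced a numeric score per training example (for instance, a class probability from a classifier trained on that subgraph alone), and it uses a greedy Pearson-correlation filter as a stand-in for the notion of "strongly uncorrelated". The function names, the `max_corr` threshold, the subgraph names, and the data are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_subgraphs(scores, labels, max_corr=0.5):
    """Greedily keep subgraphs that are mutually weakly correlated.

    scores: dict mapping a subgraph name to its per-example scores (assumed input).
    labels: per-example class labels encoded as numbers.
    max_corr: hypothetical cutoff on absolute pairwise correlation.
    """
    # Visit subgraphs in order of decreasing correlation with the target class.
    ranked = sorted(
        scores,
        key=lambda name: abs(pearson(scores[name], labels)),
        reverse=True,
    )
    kept = []
    for name in ranked:
        # Keep a subgraph only if it is weakly correlated with every kept one.
        if all(abs(pearson(scores[name], scores[k])) < max_corr for k in kept):
            kept.append(name)
    return kept


# Toy data: two nearly redundant subgraphs and one largely independent subgraph.
labels = [1, 1, 0, 0, 1, 0]
scores = {
    "client_account": [0.9, 0.8, 0.2, 0.1, 0.7, 0.3],
    "client_account_order": [0.85, 0.8, 0.25, 0.1, 0.75, 0.3],
    "client_district": [0.6, 0.5, 0.5, 0.3, 0.4, 0.6],
}
kept = select_subgraphs(scores, labels)
print(kept)
```

In this toy run, only one of the two near-duplicate subgraphs survives, and the relations that appear in no kept subgraph would be the candidates for pruning from the schema.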