Database classification for multi-database mining

Authors:
Xindong Wu;Chengqi Zhang;Shichao Zhang
Affiliations:
Department of Computer Science, University of Vermont, Burlington, VT;Faculty of Information Technology, University of Technology, Sydney, P.O. Box 123, Broadway NSW 2007, Australia;Faculty of Information Technology, University of Technology, Sydney, P.O. Box 123, Broadway NSW 2007, Australia and State Key Laboratory of Intelligent Technology and Systems, Tsinghua University, ...
Venue:
Information Systems
Year:
2005

Abstract

Many large organizations maintain multiple databases distributed across different branches, which makes multi-database mining an important data mining task. To reduce the cost of searching the data in all databases, we need to identify which databases are most likely to be relevant to a given data mining application. This step is referred to as database selection. In real-world applications, database selection has to be carried out repeatedly, once per application, to identify the databases relevant to each one. Moreover, a mining task may not refer to any specific application at all. In this paper, we present an efficient approach for classifying multiple databases based on their similarity to one another. Our approach is application-independent.
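The abstract does not spell out the similarity measure or the classification procedure, so the following is only a minimal sketch of what similarity-based, application-independent grouping could look like. It assumes each database is summarized by the set of items it contains, uses Jaccard similarity as a stand-in measure, and groups databases greedily against a threshold `alpha`. The names `jaccard`, `classify_databases`, `alpha`, and the toy data are all hypothetical and are not taken from the paper.

```python
def jaccard(items_a, items_b):
    """Jaccard similarity of two item sets (assumed measure, not from the abstract)."""
    a, b = set(items_a), set(items_b)
    union = a | b
    # Two empty databases are treated as identical by convention.
    return len(a & b) / len(union) if union else 1.0


def classify_databases(item_sets, alpha):
    """Greedily group databases so that every pair in a class has similarity >= alpha.

    item_sets: dict mapping a database name to the set of items it contains.
    alpha:     similarity threshold in [0, 1] (hypothetical parameter).
    Returns a list of classes, each a list of database names.
    """
    classes = []
    for name, items in item_sets.items():
        for cls in classes:
            # Join the first class whose members are all similar enough.
            if all(jaccard(items, item_sets[member]) >= alpha for member in cls):
                cls.append(name)
                break
        else:
            # No compatible class found: start a new one.
            classes.append([name])
    return classes


# Toy data: four branch databases described only by their item sets.
example = {
    "D1": {"a", "b", "c"},
    "D2": {"a", "b", "d"},
    "D3": {"x", "y", "z"},
    "D4": {"x", "y", "w"},
}

if __name__ == "__main__":
    print(classify_databases(example, alpha=0.4))
```

With the toy data, `D1` and `D2` share two of four distinct items (similarity 0.5), as do `D3` and `D4`, while the two pairs share nothing with each other. A threshold of 0.4 therefore yields two classes, and a threshold above 0.5 leaves every database in its own class. Because the grouping depends only on the databases themselves, it can be computed once and reused across mining tasks, which is the sense in which such a scheme is application-independent. Note that the greedy pass depends on the order in which databases are visited.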