Toward Multidatabase Mining: Identifying Relevant Databases

Authors:
Huan Liu;Hongjun Lu;Jun Yao
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2001

Abstract

Various tools and systems for knowledge discovery and data mining have been developed and are available for applications. However, when we are immersed in heaps of databases, an immediate question is where to start mining. More databases are not necessarily better for data mining; they help only when the databases involved are relevant to the task at hand. In this paper, breaking away from the conventional data mining assumption that many databases should be joined into one, we argue that the first step in multidatabase mining is to identify the databases most likely to be relevant to an application; without this step, the mining process can be lengthy, aimless, and ineffective. We therefore propose a measure of relevance for mining tasks whose objective is to find patterns or regularities about certain attributes, and we describe an efficient algorithm for identifying relevant databases. Experiments are conducted to verify the measure's performance and to illustrate its application.