Collecting Quality Data for Database Mining

Authors:
Chengqi Zhang; Shichao Zhang
Venue:
AI '01 Proceedings of the 14th Australian Joint Conference on Artificial Intelligence: Advances in Artificial Intelligence
Year:
2001

Abstract

Data collection is necessary for organizations such as nuclear power plants and earthquake bureaus, which have very small databases. Traditional data collection obtains the necessary data from internal and external data sources and joins it all together into one large, homogeneous database. Because collected data may be untrustworthy, it can disguise genuinely useful patterns in the data. In this paper, breaking away from the traditional data-collection mode, which treats internal and external data equally, we argue that the first step in utilizing external data is to identify quality data in the data sources for the given mining task. Pre- and post-analysis techniques are thus advocated for generating quality data.