Collecting Quality Data for Database Mining

  • Authors:
  • Chengqi Zhang; Shichao Zhang

  • Venue:
  • AI '01 Proceedings of the 14th Australian Joint Conference on Artificial Intelligence: Advances in Artificial Intelligence
  • Year:
  • 2001

Abstract

Data collection is necessary for organizations such as nuclear power plants and earthquake bureaus, whose own databases are very small. Traditional data collection obtains the necessary data from internal and external data sources and joins all of it into a single large, homogeneous database. Because collected data may be untrustworthy, it can obscure the genuinely useful patterns in the data. In this paper, breaking away from the traditional data-collection mode that treats internal and external data equally, we argue that the first step in utilizing external data is to identify quality data in the data sources for the given mining tasks. Pre- and post-analysis techniques are thus advocated for generating quality data.