Estimating null values in relational database systems using automatic clustering and multiple regression techniques

  • Authors:
  • Shu-Ting Chang; Shyi-Ming Chen

  • Affiliations:
  • Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan, ROC;Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan, ROC and Department of Computer Science and Information Engineering ...

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2009

Abstract

In this paper, we present a new method for estimating null values in relational database systems using automatic clustering and multiple regression techniques. First, we present a new automatic clustering algorithm for numerical data. The algorithm requires neither that the number of clusters be fixed in advance nor that the data in the database be sorted in advance. Then, based on this clustering algorithm and multiple regression techniques, we present a new method for estimating null values in relational database systems. To estimate a null value, the proposed method only needs to process one particular cluster instead of the whole database. It achieves a higher average estimation accuracy rate than existing methods for estimating null values in relational database systems.
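The abstract does not spell out the clustering algorithm or the regression setup, so the following is only a minimal sketch of the general cluster-then-regress idea, not the authors' method. It assumes a simple distance-threshold ("leader") clustering as a stand-in, which likewise needs no preset cluster count and no sorting, and an ordinary least-squares fit restricted to the cluster nearest the tuple with the null value. All names (`leader_cluster`, `fit_ols`, `estimate_null`, `radius`) are hypothetical.

```python
from math import dist

def leader_cluster(points, radius):
    """Stand-in clustering (an assumption, not the paper's algorithm):
    put each point in the first cluster whose centroid is within `radius`,
    otherwise start a new cluster. No cluster count is fixed up front."""
    clusters, centroids = [], []  # clusters hold indices into `points`
    for i, p in enumerate(points):
        for k, c in enumerate(centroids):
            if dist(p, c) <= radius:
                clusters[k].append(i)
                n = len(clusters[k])
                # Update the running mean of the cluster.
                centroids[k] = [(cj * (n - 1) + pj) / n for cj, pj in zip(c, p)]
                break
        else:
            clusters.append([i])
            centroids.append(list(p))
    return clusters, centroids

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_ols(xs, ys):
    """Least-squares coefficients [b0, b1, ...] for y = b0 + b1*x1 + ...,
    obtained from the normal equations (X^T X) b = X^T y."""
    rows = [[1.0] + list(x) for x in xs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)

def estimate_null(xs, ys, query, radius):
    """Estimate the missing value for `query` using only its nearest cluster."""
    clusters, centroids = leader_cluster(xs, radius)
    k = min(range(len(centroids)), key=lambda i: dist(query, centroids[i]))
    members = clusters[k]
    beta = fit_ols([xs[i] for i in members], [ys[i] for i in members])
    return beta[0] + sum(b * q for b, q in zip(beta[1:], query))
```

Here `xs` holds the known attribute values of complete tuples, `ys` the attribute that is null in the query tuple, and `query` the known attributes of that tuple. The sketch assumes each cluster contains enough non-collinear tuples for the normal equations to be solvable; a practical version would need a fallback when a cluster is too small.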