A genetic algorithm-based clustering approach for database partitioning

  • Authors:
  • Chun-Hung Cheng; Wing-Kin Lee; Kam-Fai Wong

  • Affiliations:
  • Dept. of Syst. Eng. & Eng. Manage., Chinese Univ. of Hong Kong, Shatin

  • Venue:
  • IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews
  • Year:
  • 2002

Abstract

In a typical distributed or parallel database system, a request usually accesses only a subset of the database. It is therefore natural to group commonly accessed data together and to place it on nearby machines or sites, preferably the same one. For this reason, data partitioning and data allocation are performance-critical issues in distributed database application design. This paper addresses data partitioning, which relies on clustering. Although many clustering algorithms have been proposed, their performance has not been studied extensively, and the special problem structure in clustering is rarely exploited. We explore a genetic search-based clustering algorithm for data partitioning, with the aim of high database retrieval performance. Formulating the underlying problem as a traveling salesman problem (TSP) lets us take advantage of this structure. We also propose three new operators for genetic algorithms (GAs); experimental results indicate that they outperform other operators in solving the TSP. The proposed GA is then applied to the data-partitioning problem, and our computational study shows that it performs well for this application.
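The abstract does not describe the three proposed operators, so the following is only a minimal sketch of the general idea, under stated assumptions: clustering is treated as a TSP-like search for an ordering of attributes in which frequently co-accessed attributes end up adjacent, and a small GA searches over orderings. The affinity matrix, the simple order-crossover variant, the swap mutation, the tournament selection, and the threshold-based cut rule in `split_clusters` are all illustrative stand-ins chosen here, not the authors' method.

```python
import random

def path_affinity(order, affinity):
    """Sum of affinities between neighbours in the ordering (higher is better)."""
    return sum(affinity[a][b] for a, b in zip(order, order[1:]))

def order_crossover(p1, p2, rng):
    """Simple order-crossover variant: keep a slice of p1, fill the rest in p2's order."""
    n = len(p1)
    i, j = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[i:j + 1] = p1[i:j + 1]
    kept = set(child[i:j + 1])
    fill = iter(g for g in p2 if g not in kept)
    return [g if g is not None else next(fill) for g in child]

def swap_mutation(order, rng, rate=0.2):
    """With probability `rate`, swap two positions in a copy of the ordering."""
    order = order[:]
    if rng.random() < rate:
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
    return order

def evolve(affinity, pop_size=40, generations=200, seed=0):
    """Hypothetical GA loop: elitism, size-3 tournaments, crossover, mutation."""
    rng = random.Random(seed)
    n = len(affinity)
    fitness = lambda o: path_affinity(o, affinity)
    pop = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        next_pop = pop[:2]  # carry the two best orderings over unchanged
        while len(next_pop) < pop_size:
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            next_pop.append(swap_mutation(order_crossover(p1, p2, rng), rng))
        pop = next_pop
    return max(pop, key=fitness)

def split_clusters(order, affinity, threshold):
    """Assumed cut rule: start a new cluster wherever neighbour affinity is below threshold."""
    clusters = [[order[0]]]
    for a, b in zip(order, order[1:]):
        if affinity[a][b] < threshold:
            clusters.append([b])
        else:
            clusters[-1].append(b)
    return clusters

# Made-up affinity matrix: attributes with the same parity are co-accessed often.
affinity = [[0 if i == j else (10 if i % 2 == j % 2 else 1) for j in range(6)]
            for i in range(6)]
best = evolve(affinity)
print(best, path_affinity(best, affinity), split_clusters(best, affinity, 5))
```

In this toy setup, an ordering that keeps each parity group contiguous scores highest, and cutting at the weak link between the groups yields the partition. A real study would derive the affinities from an actual query workload rather than a synthetic matrix.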